Fig. 3a-c
Fig. 3d-e
Figure 4

Genomic sequence of the HtH1 gene 



SIGNAL PEPTIDE SEQUENCE 1S-1 (1st part) 
GGCTTGTTCAGTTTCTACTCGTCGCCCTTGTG 
INTRON 1S-1/1S-2 (SEQ ID NO: 109) 

GTAAGTCAACGTCTTTGTTTTAAGTTTGATGCATATCTATCATTGCGTTTTAAAATACCA 
TTACAACCAACGTGTCTCTATTGGTCTTCACCTGTTTAACGTATATATTGTTTTTAATGT 
GAAAATCTGAGATTATTTTCATTTCCGTCAATATTCGTAAAATACTATACAAATAAAATT
GCTTCAGCCTATTGCATTGGCAGTTTTCGCAGAATAACGAGGGAAGGCGTACATAAAATA 
TAAACCAGTGTATATTCAAGCATGTTTATAATTTCTTTATAGATTATAACATCATATCAA
AACACCAATCTGGATTTAAACCCGTGAATCCAAAGTATACCAATTAACGGAACTTTATCA
TGTTTTATCAAAGGTTTTAGATGAGGGTAAAGAAGTCCGAGCTATATTTTGCGATATCAG 
CAAAGCCTTCTATCACGTCTTGCACACAGGGCTGGTATCTAAACTCGAATCCACAGGAAT 
AAATATTTCAGCCGATAGAGAACAGTCGGTGGCTATCATTGGTCACAAAACAAGTCCAAA
ATCTGCATTAGCCGGTGTTCCCCAAGGCTCTGTCTTGGGGCCACTATTATTTCTCACCTA 
TATAAACGATTCAACTAATGGAATATAAAGCAACGTAAACCTCACCGCAGATGAAACACT
AAGTTATAGACAATCCGTTTAAAACCCAGCCACTGCTTAATAATGACTTAGGCCGTCTTT
CAGACTGGGCTAGTAAGCGGCAGGTTAAATTTCACCTTGAAAAGACAGAAACCATGGTAT 
ATTTCAAAAACACGAATGCAAGTCCTAAACTTCAACTACTACTTGATGATACTGGGATTT
CTAAAGTGTGTGAACAAAAACACATTGGCCTGATCCTACAAGATAACCAGACAGAAACCA
TGTTTTTTTTCAATAACACGAATGCAAGTCCTAAACTTCAACTACTACTTGATGATACTG
GGATTTCTAAAGTGTGTGGTGAACACAAACACCTTGGCCTGATTCTGCAAGATAATGGAA 
AATGTCAGAAACATAAGCAAGTTGATGTGGGGTTTTCTGGGGGTTGTGACAACACCGAAA 
GACCCTGCAACTAATGTTAGCTCAAAGGGTTTTACACCCGGTCACAAGTGGGGATCGACC 
CAGGCACCTTTTGCCTTTGACAGCTCGCCTTTCAAAAAATCTCAATTCGAAAACGAAATC 
TAATAATTTCATGAGCGATACAACCGTTTTTCATAATGCTGTGGTACCGCATACTGTGGA 
AACATCTGTCTACCCATTTGGTAGTCCCCCATAAAATGTATTTATGTTTATAAACACAAT
GTTTATAGGGTTACAGTTAGAAGAAGCATTTCTATTGGCTAATGTACATTGCTTGTTTTT 
ACTATTGTGCAAAGGCATATTACAGGTCTTTTAGGAAATTAAATACTGTTTAAATCACAT
ACACTACCGGTAATCCTATTATGCTTATCCTGCCAACATTCTGCCCAAGCAAACGCATGA 
AAGTTAAAGCTGAGTGTAAAATACTGATTGCTGTGTTACTTCACAACCAGTGGACTGAAT
ACAACCATGTTTTTTCTTGAAAGTCACAAACATCCAGTCGGTTTCTAATGTGTTAAGTTT 
CTAGTTTCATAAAGAGCATGACGTAATGGTGAATAGGAGTTATCAATGTTTCTATCTAAT
GACTCCTAGTTCGTTACTTTTTTAATAAAACATCCATGTGTTTAATGTTTGGCCACAGAT 
ATAACAAGAAAGAAATCGGATAAAATCTACATTTTGACCAATCGGAAGGCTGCCCCCTCC 
CTAATCCTAATCATTTTTGTGCCTCAAAACATACTCAACCAGACATTTGAACTATGTATA
TATCAGAATGAAATGGTAACAATAAACTTGTATGTTGACCAGACAGAATTAGGGTGAATC
TGAATACCAACTATTGTCACATATGAATATGGATAAGCTCTGCGCGTGCGTGCGGGCGGT 
GTAGTGCGTGTGTGTCTGTGTGTGTGTGTGTGTGCGTTTGTGTGTGTGTCTGCGTGCGTG 
TGTGTGCGCGTGTGTGCGTGTGTGTGTGTGTGCAGTGTGCCGAGTGTGTGTGTGTGTGTG 
TGCACAGACATGTGGTTGAGACACACTTGATTCAGTGCAGGATTATGTCCTTCAACCGAG
TGTAGTCTTTAAGTGTGCCTGGAAACAAAAAACTGCGTTGGGTTGCATCGCCTCTGTAGC 
AAGCTTGGACGCGTCACGCAGCTCTGATACCACGTATTGGCACCATGTTTCATCGGTCTC 
ACGCGAATATTATGCTATGTGTGGCGTATCATACCATAGGTTGGGAACGTTTCAATACTG 
TACCGAGCTTGGGCGTGTCACAAAGCTATGATAAGATGACAACACGTCTTGGCATCTTGT 
TTCCTCGGTATCACGCGCTGTTATGCTATGTGTGGCTATCACACCTTAGGTTGGGAAAGT 
TTCCACATTTTCCAGCCTCGTACATGTTTCCTTTTGTTTTTTCCTTAGTTATCAGCATAC
CGTATATTCTATATTTAATGAGCATTTGTATTTTTCTACAG

SIGNAL PEPTIDE SEQUENCE 1S-2 (2nd part) 

GTGGGGGCTGGAGCAG

INTRON 1S-2/1A-1 (SEQ ID NO: 110)

GTGAGTTTCTTAACATTGTCATGGTACATGGATATACGCTCAGTGGGAAAGCAGGATATC
CCCTTGGTTCAAGTATTCACTTGTCACGCCAAGTGTTCGATTCCCAACATGGAATACTGT
CATATAGTAAATTGATACACTACTTACATTTAATTCTCCACTAAACGTCAACGTCCTTTA
CTTCATGGCCCACATGGTCCGTATTAGTGAGTGAGTGAGTCAGGGCATAAGTATTTAACG
TCAAATCAGCAATATTTCAGCCATATTGTGACAAGAATTGAATATAAATAATTATACTTA
TAATGCTTATAAATATAAATTATATAAATACCTATAACTATAAATTAGTTATACTAGTAT
TTATCAAAACATATTTGCCACGACACTGCACGCCGATACTTCAAGTGTCTTCACCTCAAG
CGTGTAACTCCTCATACTCTGTAATAAGTATGTACACTAAGTGAGTGCTATCATCTCCAT
GCTTCATTAGTTTCGTCAGATGCGTGTATCCATACGAGTACATTCAGATTATGGGATCCA
GAGCTTTCTTATCTCAAGTATTTCCGATTGTAAAGCCATACTACTTCCCCAATGACTGAC
GAGACAGATGGCAACCGTTCTTTCCTCCTGACTAGGTGAGTGCCACTGATAAATCATTAT
GCCTTTAACATTAGGAATGTTAGCAGTGCACATGTTTCAGAATTGCGACCTTATGGTTGT
AAAGATTACAAACTTTACAACTTACTTGAGACAGGTTCCATATGTCGTATCTGAAATAGT
GTGAAGGTATCTGATTCGATGCAATACACAGACATATAAACATATTGTCGCCCTGCTATT
CCGGAAAGGTCATTTTGTATGTAACGTTCCTTAATGGACACAAACGGAATTATTAGTTAA
ACATACTCAACAAAACTATGTTATTTTGCAATGGGTAGCACCGAAATCTACCGACAGTGG
TTCGTAAAAGTAGAACATTCTGACATAAAGAAAAATCATTGGCTTTAAATATATGCAAGT
TACTTGTCTCTAACAACCAGTTTTATACACATTTCAGAGAACGGGGAATCCGCGATGACA
ATATCAACGAGTATATACAGAATATATAATTAAAAACGATGAGTGCCTGGCAAGGGAAAG
AGCGAGATTTGCCAAACAGGGGGGTGGTGTTGAGCTTGAATCGTGGAGAAACGTAGATTG

AAAGACAAGATGACATCTAATGATCCGAAAATCAAACACAGGATTAACTGGGATGCAGAA 
GAATGAATATCTCAAGCATACATGCAACACTTCATGAATGCATCTCAAACATTTTCGTCA
GATCGGATGCATGAAGATTTGTAAAGCAATGGTTTAAATTGTCCCTAAACGTTTAGTTGG 
AGATGTATGAGGCTAGGCTGTATGTTGAACGAAACCATTTAACATTGTTGTTCATGATTA 
TTTAATATTTTTTCATTTTATAGATGTACAATAAAATTGGAAACTAAACATTTCCCTTTA
TTGTTTTGTATTTACCTGTTCATGGGTATGTTTTGAAAGATCGTGATATTTAGTTGGCAT 
TCACAAGTTGGAAAAAGGTCACTCAGTTTGATTTCAAGTTTATGTAACCTCTTTATCTGA 
CGCTCCAAAATATGTATAGCCTTGTTCATCTGTCGGTATGTGGATATTCCTACTTCAGGG 
TAGGGTAGCATTAATACTTACAAAACATAACGTGTACCAGATTTCAGTCACCTCAGAGAT
GATAATGCATGTCGATATGATAGGTCAAAACTTTCGATATCAATCACAATGAACCTATGG
ACCCTGAATCGGAATGATACGTTACACTTTAGAAACAATTCACAAATATGACTGTCACCC
TTTCAGGTAATAATGTTTGACGGACTACGATAGTGCTGAACAGCAGGAGAGGCAACATGG 
TTCGATTGTGAGACAGGTTTAGTGTATTTGTTTGCGAATTTAAGGTTCTGAATCACAATA 
GACACGGTTCAGTTAATGGATAAACCAATCATTAGATAGATAGAGATTAGTCGCGATATT
GCTGGGATAAAGCTTAGTGGGACGTTAAGTCCCATCTCAATCTCTCTCATTTTTTCCAAA 
ACAGTTTTAATTCAGGCTCATGACAAGGTCGTACTGTTGCAAAGGATTCTACTTCAAGCA 
GAGATGTCTCATGAATACAGTACAGGGTTTTTGAAGTTTATCCAGTGCAGCGCTGGCACC 
ATCTCTGCATGCGAATTATACCATCCATGCCGCTCTAGGCTATTTGTATTAAGTCTGTAG 
AATTAAATTCGCGAGTTGCAAATACTGCTCACCATTATCTGCCTCAACCCAGTTTGGGTA 
CATGCGATTTACACAATATTATGTATAATGTTCGCTTTTCGAAAACAAAACACCTAAATT
CATCCAAAGTTTTGGGAGATTTTATTCGAGAAATCAACCTGAGATGTTGAATCGGGAGCT
GCGCTTATTCAATGGTGGACTCGGAAGGGAAGTAACCGCTGATGAGGCAAAACAATAACG 
CAAACATATGGAAGTGGAACTCTTTGAACCAGTATTATGTTTGTGTGGACATGTATGTGT 
TAATTTGACCATTCGAACAACTTTACTATTCTATTCATAATGTGTTTAGATTTACATTTG
AATTAAAAGAGATGAGTTTAAGATATTAATATTTTCCTTTTATAGTCTGTCGTGATTGTA
GGGCAATATTTATGTATGTTCGTTCATTTTTCATTTATCATTTGGAAAGGTATATCATAA
GATTATTATTATCATTCTTGAAGTAATGTATACATATATATATGTCTTGAGTAGCTTATT
TTCAATTTATTATCATCCGTCATCCAATTTTATTTCACGAAAGTATAAGAAATAACGAGA
GAGAGAGAGAGAGAGAGAGAAAAGACAGAAATGAAGTTAGGAGATATNAGTTATCAAGAA 
AACAACAGTTTGAATTTTTTGTTTAGACAAGATATCATATCAATAACCTCGCACTATTAC
GGGAATAGGCGGGCGTTCCATATGCACAATGAATCGTCAGTTAAAATCAACATTAAACTT 
AAAATACTCCTCATATTTAAAGTTGATCTACCTCTTGTATTATTGTAGACTATTAGACAG
AAGTCGACAGTGACACCAGCAACCAGATATCATACCCAGACTTAAAAAGCTGTTTCCTTG 
ATGTTTCAATTTATTTCCATTTCCATTATTTCCCTTTATTGGTTTCCATTTATCAAACTT 
ACCATCTGCACCAGTGGGAGATTGATATGTTGTATTTATTTATATTTCTTGTACTACAAT
ATCAAGAATGTATAGGAGCTATTCCTTGTTCCTAAAACCGGATAGATCCATAATTTCCAT
TTTGGGATAAATGGAAACTAAACACAACTTTTACAGTAAACACGAGTGAGCAAGTTGAGT
TTTACGCCGTTTTTAGTAGTATTCCAGCAATATCGCGGCGGGGGACACCAGAAATGGGCT 
TCACACAGTGAATGCATGTGGGGATTCGAACCCGGGTCTTCGGCGTGACGAGTGAACGCT 
TTAGCCACTAGGCTACCCCACCGCCTATTTATAGTTAAGACGAATACTTTTCTCAAGCCT
CAAATATGTCCATTCTAGAGAGACTGAATCTGATCCTGAATCTGCGGACCGGTCTTGAAT 
ATCATCCCACTAACTCATTGTACAAAGTACCTGTAGATTGTCAGTTCAAAGACAGATTTC
ACAACCCTATTATATTTTGTCCTGCTCATTAAGATATTCAGACTCACTCAAACTGCTAAA
TGATTTTAATCCTACTTTGAGATGTTTTAACTTTTATTCGATGCATTTTTGCGTTCTGCG 
TCCTGTATAAAGGTAAAGCAGGTAAACTAACCTAACCTGTTGATTTATTTCATAGTTTTG 
CGATCAGATTGAAACCGGAATGCACAGTGAAGTGTGGCATACATCTTTCCACAGAGATAC
TGGATACTAGGTGGTACAACCGCATTGGCTTTGTGAAAGGATATTAGTGTTTTATGAGAC 
TGACTCATGTTTCAATGCTTAGAGCGGAATGATCTCGGTCTTCATGAAAAATATTGTGTT 
GAAGTAACCCCCCAGTCCCTAACAGAACGTGGGGAAAGCAGATGGATATGCCAAGACATC 
TTCGCATGGTGTGAAGATGATCGTTACAACATCTGCAGAAAAAGTTATTTCTGTGAAGAA 
TATGCCAAAGCATCACTGTGAGTGTTTTGAAGATGTGATATGGCAACACGCAGCGTGTAA 
TTATGCTTTGTGTGTATTTCTGAAGATCCGTATGAGCATGGCGCCAAACTATCAGTTAAA 
TGGCTATGCGAAGATCTTCCCGAGATGGTAAACACATATTTTGGCCATTTTCTTTGTAAG 
TGGGCGACACAGAAGATCCCCCTGATTGTGTGGATGAGGACACAAAAACGGGTCCCCCTT 
CCTTTGCTGATGCTAATGACGCCCTGGAAACATTGAAAGACTTCTTCTCCAGCAAGCAAG 
CCACCAACCACAAGTTGTATAAATCGCTTGCGGACTTGAATACGGCAGTTGGACAGATAC
ATACAGCCAGAGAGGGCCGAACTAAAACATCTAAACATGGAAAAACTGTAAAGACAGGCT 
TTGTTGTACGACGTACGTAAATTCATTGAATGTTTGAAAAGGTAGAAAATTATTAAATCT 
TTGAAACCTCGCTCTGTTTGTTTGTTATTGTCCCCCACATTTGCAAATGGTATCCAAAAA 
GGGCAGACACATTTGTTTTAATCTTAGCCAGGTTCAATTTAGCCTTGCGCCCAGACTCAT 
TGTATCTGGTGAAGGCTATAGGTGGCCACGTCTTCTAAGATGCTATGCTATTCTTACCAG 
AATCCAATGTAAAGAGTTCAAACGCATGGTTCGCTTTGATTGTGATTCTTTCTTAGCACC 
TCTCTCCTACCCAGAGTTCACCTGCACTGCTCCTGACTCACAATAAGCTGACGTGCTGTC 
ATATATGTGCAACATTGTATACGTTGGCGTTAAGCCCAACTCACTTCCGCTGTCTTTTGG 
CAG 

DOMAIN 1A-1 (1st part of domain a) 

ACAACGTCGTCAGAAAGGACGTGAGTCACCTCACAGTTGACGAGGTGCAAGCTCTTCACG 
GCGCCCTCCATGACGTCACTGCATCTACAGGGCCTCTGAGTTTCGAAGACATAACATCTT 
ACCATGCCGCACCAGCGTCGTGTGACTACAAGGGACGGAAGATCGCCTGCTGTGTCCACG 
GTATGCCCAGTTTCCCCTTCTGGCACAGGGCATATGTCGTCCAAGCCGAGCGGGCACTGT 
TGTCCAAACGGAAGACTGTCGGAATGCCTTACTGGGACTGGACGCAAACGCTGACTCACT 
TACCATCTCTTGTGACTGAACCCATCTACATTGACAGTAAAGGTGGAAAG

INTRON 1A-1/1A-2 (SEQ ID NO: 111) 

GTAACTACAAACGTCGTCCCATTCATACAGGAGAAATATACAATTGTGTTGTAAGAGCGG
TATACTGTTTGCCAACTGTGTAATTGAAACGTTGATGATGGTGTCTTTGTATTTCAATTT 
GTATGCACTTAGACATGATCAATGTTTCTGATGTGTCAAGGATGTTCGGTGTGTCACTTT
CAAAAGATCAAATTCATATGACGTACACAGAGCAAGAACCAACAGTAAGAAGTCTGTATG
ACTTCGCTCTTAAAAGCAATGGAAAAATATTTTCACTTAACACCTAGCCCATAATCACGC 
ATATTAGATTATTCAAGCGATGTCAACATGTTTTTAATATCAATCTCATGGTTCTGATAT
TACCGGAGACATGCAACAGGCTGCCATTATAGCCAGGAAATCTTATGAATATGTGCATAT 
TTTTTCTTTGATTCTGTATGACGAGAAATATTCGGAGGCAAAGATTGTGTTTTCAGAACA 
GAATCAGGGTATCAGTGACATCGTCACTGCATGGCTACAATATTGCTGATGTGACTGTTT 
CTCCAAGGATTTTCATCTCACTGTCTGTACTTTGAATCTACAAATTCGTATTAAAGTTAT 
GACAATTTTACCCCTGCCTATTTGTAAACGAAATATAACATGAGTGTTTATGCTGACAG

DOMAIN 1A-2 (2nd part of domain a) 

GCTCAAACCAACTACTGGTACCGCGGCGAGATAGCGTTCATCAATAAGAAGACTGCGCGA 
GCTGTAGATGATCGCCTATTCGAGAAGGTGGAGCCTGGTCACTACACACATCTTATGGAG 
ACTGTCCTCGACGCTCTCGAACAGGACGAATTCTGTAAATTTGAAATCCAGTTCGAGTTG 
GCTCATAATGCTATCCATTACTTGGTTGGCGGTAAATTTGA 

INTRON 1A-2/1A-3 (SEQ ID NO: 112) 

GTAAGTTTGGTTTACAGTTTCATTATAAAAACATAGCAGTTTTAAGTTTAGGGGCAGATT 
CTAATCTCTAATATTCCTTTCAACTCACTTTATTGGTGCCTTCTTGGAGTGACATTTAGA 
AACTAAGACAAGAGGAAGATGAACAATGTTTGTAGGGATAGACAGCTTGGATGCAATTTC
GGACCAGATTCTAACAGCGTCATGAAGCAAGTGATACACAACGTTATCAATAACGAGAAT
ATACACATAGATGGTTTGAGTTTATAAATGAACTATTAACGGCATTGTGGTTATAGACAG
TGAGGAAGACGCCAGATAGACAAAGGGTAGGGGCCTTGGTTAGATAATGAGAAGTTGAAG 
AGGTGTAATAACTTAAATCTCTCTTGACTATTGATTGTGTCTAAGAGTTTTCTTATCTTA 
CAGTCGGCCAGTTGGGTCAAAGATGGTGTGATTCGGATGTGCTTTGTGTGTTCTGCGATG 
GCTGATTTAGAGTCAGTTTACTTCAGATGAATGAAGTTCCCCGATTCTTATGTTTAAGTT 
TGTTTCACCTACGCATGAAGACATCACCAGCAGGGTCGTCTTTATTTCTAGTAGCTTATT 
TACAGCAAGCTTGTAACGTATGCTGAATTGCTGTGCCTCTGTAGAACACAGCATCTATGT 
TTGCTTGCTTCTTTAGTAGACTGCGGATGTGATGGTTGGTTACCTGGTATGCTGACGAAA 
GAATTGTTGACGTGGTGGTTTGCCTTGATGGGTTCGTTGACTTGGTTTGTTGGATACTGA 
TTAAGGTGACTCTGCTGGGAGGCTTGGATTCTGGGGCCGGTGTTCTTTGCTCTCCTGTCT 
AGGGTGGCGATTATTTCCCAACCCACTTGTTCCATTACACTCAAAACCTGCTATCAATTT 
ACAG 

DOMAIN 1A-3 (3rd part of domain a) 

ATATTCAATGTCAAACTTGGAATACACCTCCTACGACCCCATCTTCTTCCTCCACCACTC 
CAACGTTGACCGCCTCTTCGCCATCTGGCAGCGTCTTCAGGAACTGCGAGGAAAGAATCC 
CAATGCAATGGACTGTGCACATGAACTCGCTCACCAGCAACTCCAACCCTTCAACAGGGA
CAGCAATCCAGTCCAGCTCACAAAGGACCACTCGACACCTGCTGACCTCTTTGATTACAA

ACAACTTGGATACAG

INTRON 1A-3/1A-4 (SEQ ID NO: 113)

GTGAGACATTATTACACTTCTATTTAGTAGTGGGGGCGGGATAGCTCAGGTGGTAGAGCG
TCGGCCTTCAGCTTCTAGTCTCGCCCACAAGAGCGCGCTGGCTAAAGGGCCGGAGTTAGA
TTCCCGCGGGCGGCAGGCAATATCTCCGAAGGGGAGAACAGTTCTCCAGTCGGTGAAATT
GGGGTGCAATGTTGTACCACTGAAATGCGTGCAGCACCAACCATCCAAATACCAGCCTTG
CCGCGCTGGTCTGACTACATAGTACCACCCGGATTCAACCGGGCTATATAGGTTCTCCTC
CAGCAGTAAATCTGACAGTCGCCATATAGCTGGGATATTGCTGAGTGCGACGTTAAGCCC
CAACTCACTCACTTTATATTTAGTATTCTATTTAGTATCGACGCATGACCATGTGTGGTG
GTCTACTCATCTCAACACGACCGATTAACGTTAAGAGCTGCCAACATGATTCTCTTTCTC
TCTTTAGCCTCTTTATGCCAAAAGCTATATATTAATGTAGGACCCTACATATATTATTTC
CAG 

DOMAIN 1A-4 (4th part of domain a) 

CTACGACAGCTTAAACCTGAATGGAATGACGCCAGAACAGCTGAAAACAGAACTAGACGA
ACGCCACTCCAAAGAACGTGCGTTTGCAAGCTTCCGACTCAGTGGCTTTGGGGGTTCTGC 
CAACGTTGTTGTCTATGCATGTGTCCCTGATGATGATCCACGCAGTGATGACTACTGCGA 
GAAAGCAGGCGACTTCTTCATTCTTGGGGGTCAAAGCGAAATGCCGTGGAGATTCTACAG 
ACCCTTCTTCTATGATGTAACTGAAGCGGTACATCACCTTGGAGTCCCGCTAAGTGGCCA 
CTACTATGTGAAAACAGAACTCTTCAGCGTGAATGGCACAGCACTTTCACCTGATCTTCT 
TCCTCAACCAACTGTTGCCTACCGACCTGGGAAAGGTCACCTTGACC 

INTRON 1A-4/1B (SEQ ID NO: 114) 

GTAAGTTGATTGTCTTAATATTGTTTTAATTTTTGCAGAAATTTGATTTTAAATTGTGTA 
ATAACAGTACACATTTTTACGCAACAGCAGTCATTATTGTGTGTGAAGATGTCAAACCAG
AAAGGTTTCAATCGTGAAAACAAAAACAATTCTCTATCTGTATACCCCTCAATACCAGTA
TGATCACAAATCTAGGAAATATTACAATACTGCTTCATAGAGTAACTGCTGTTTGTGGCA
GAGCTGGATACGAAGTTTCTGATAGTTCACAGCTACATGATAGTAAATGAACCTGTACAC
ATCAACGGTTGATCATGAAAATTTTGTATGTGTGAAAGTGCTACCTGTATTAGTGAACGT 
GCTACCTGTATAACTGAAAGTGCTACCTGTATGACTGAAAGTGCTACCTGTATGCTGAAA 
GTGCTACCTGTATTAGTGAACGTGCTACCTGTATAACTGAAAGTGCTACCTGTATGACTG 
AAAGTGCTACCTGTATTAGTGAAAGTGCTACCTGTATGAGTGAACGTGCTACCTGTATAA 
CTGAAAGTGCTACCTGTATGACTGAAAGTGCTACCTGTATTAGTGAAAGTGCTGCCTGTA 
TTAGTGAAAGTGCTACCTGTATGACTGAGCGTGTTACCTGTATGACTGAACGTGCTACCT 
GTATTAGTGAAAGTGTAATCTGTATGAGTGAAAGTGCTACCTGTATTAGTGAAAGTGCTA 
CTTGTATTAGTGAAAGTGCTACATGTATGACTGAAAGTGCTACATGTATGAATGAGAGTG
CTACCTGTGTGACTGAAAGTGCTACCTGTATTAGTGAAAGTGCTACCTGTATGACTGAAC 
GTGCTACCTGTATTAGTGATAGTGTCACTGGTACCAACTGGATGTTCTCACTTCTTTGGC 
GAATATCTGGGCTCAAAACAGTTTTTCAGTATCATAGTCGTATCAGTTTGATTTGTATGT 
GCAGTGGAATCATTTTCGTCAAATAATCAAAACTGGTGTTGAACTGGCGTTCACGTTTTA 
TGGTTGTAAAACAAATTCTGTAAGTAAAGATATTTTAGGGATATCTGTATGACATGAACT
GAATTGCTTAAGGTTAGCATGCCATGACAAATTGCTGAATGTCTGAGGATTGGTGGAGCA 
ATAAATCATTATTAAGACAAAAATCAGAAACGTCCATTTTCACTTTTAACAGTGTATCTG
TCTGAATGCCCCCTACTTTTTGGAAGAGTATATATGAATTATCGGCAATATAAAACGTTA
AATGGCAAATGTCGGGCATATGTCAGGACATTATTACCGCAGTTTATAGTCATATTTACC 
GGGTCTAGGACAATTGTCACCCCGACAATTGCCACCCGGACAATTGCCACCCAAAAATAA 
AATATACGTAAACAGAAAACAAATATTGCTTTCAGCCTTTATTGAGTTAGATAATGACAT
TTATGTTGATAAATATGTCGTTTGATAATAATAATAACAATAATATAATATTACAATACT
GCAATAGTACTATCAGTACTTATCATTTTATCACAGATTATATATAGATTCTAGAGTCCG
ATGTTGTAGGCAACACTTCGTCGGTAGGCCGTTAGGTAGTTATCATTAGGGCTGAGTATT 
GCGCCAAATTTCGTATTGCTATATACTGCGATACACGGTTACCTGTTTTGCAATACGTAA 
ACTTAGGCAAATATGACAGTTTTTCCATGATTATTTTCACGTTTCAATGCTTAAAATGGT 
CTTATCTGTTATCTCCTTGAAGGTTTAATAAAATAACAATAAACATAAATCATTATTGAA
AATTAATGAACAAAAGTAAAGCGCTTCTCAGTTACCTTAACCTAACTTATTTATGAATGG
GATTACTATCCAAGAATGTGAAATTCACAAACACCTTGGGATAACACTGCAAAACGACTG
TTCATGGGACGGACATGAAAAAGGTGAGTCCCATGTTAAACTGTTGAGAAAGTTTCCTAT 
ACTGTTTGTCCCGAAAAAGGCTAAAGACCATGTACTAATCAATTATTCTATCTATTTTCG 
ATTACTGTTCTCATATTTGGGACAACTGTGCAGATCGGTAGCATCCAAGCTCGTCTAAAT
CGGTTTGATAAACCTTGTCAAATAACATGTTGTCTCAACATCCAAGCTCACCTAAACCTT 
GTCAATACCTGCATCTGAACAAATGTATATTTAAGACGATAGCATCCAAGCTCATCTTTA
AAATGAATATTTTCTCTTTTTCTACCAAAACATTATTTGGTTGACAGTTGTCCTCCCTAT 
TATAGTAAAAAGAACTGGGTGGCAATTGTCCTAGGTGGCAATTGTCCGGATGGCAATTGT 
CCGGGTGGCAATTGTCCGGGTGGCAGTTGTCCAGGTGGCTATTGTCCTGTTCCCATATTT
ACGTATCCCATTTTCTGCTCTGTAATTTTAAATAAACTCACCTGCCTAAGGTAAGACGAC 
ATGTGTCACGTGAACATCGTTTGGGGGCAAGGGCGGAATCCCTTCGTTGAAAGTAAATGA 
ATACTGTACATAGAGATGCGTATCTTGAACTCTTTATTAGCTTTGATATTGTGCTTAATA
TTACATGAATGTATTTCAATATGTAATTATGTGTTCAAATGAATGGTTGACTTGAATGGT 
TTTATTGCTTTATATGCTACATCAACATGTGTGTTTCTTTTCATTTCAG 

DOMAIN 1B

CACCTGTGCATCATCGCCACGATGACGATCTTATTGTTCGAAAAAATATAGATCATTTGA
CTCGTGAAGAGGAATACGAGCTAAGGATGGCTCTGGAGAGATTCCAGGCCGACACATCCG 
TTGATGGGTACCAGGCTACAGTAGAGTACCATGGCCTTCCTGCTCGTTGTCCACGACCAG 
ATGCAAAAGTCAGGTTCGCCTGTTGTATGCATGGCATGGCATCCTTCCCTCACTGGCACC 
GGCTGTTCGTTACCCAGGTGGAAGATGCTCTTGTACGGCGTGGATCGCCTATCGGTGTTC 
CTTATTGGGACTGGACAAAACCTATGACTCACCTTCCAGACTTGGCATCAAATGAGACGT
ACGTAGACCCGTATGGACATACACATCATAATCCATTCTTCAATGCAAATATATCTTTTG
AGGAGGGACACCATCACACGAGCAGGATGATAGATTCGAAACTGTTTGCCCCAGTCGCTT 
TTGGGGAGCATTCCCATCTGTTTGATGGAATCCTGTACGCATTTGAGCAGGAAGATTTCT 
GCGACTTTGAGATTCAGTTTGAGTTAGTCCATAATTCTATTCATGCGTGGATAGGCGGTT 
CCGAAGATTACTCCATGGCCACCCTGCATTACACAGCCTTTGACCCCATTTTCTACCTTC 
ATCATTCCAATGTCGATCGTCTATGGGCAATCTGGCAAGCTCTTCAAATCAGGAGACACA 
AGCCATATCAAGCCCACTGTGCACAGTCTGTGGAACAGTTGCCAATGAAGCCATTTGCTT 
TCCCATCACCTCTTAACAACAACGAGAAGACACATAGTCATTCAGTCCCGACTGACATTT
ATGACTACGAGGAAGTGCTGCACTACAGCTACGATGATCTAACGTTTGGTGGGATGAACC 
TTGAAGAAATAGAAGAAGCTATACATCTCAGACAACAGCATGAACGAGTCTTCGCGGGAT
TTCTCCTTGCTGGAATAGGAACATCTGCACTTGTTGACATTTTCATAAATAAACCGGGGA 
ACCAACCACTCAAAGCTGGAGATATTGCCATTCTTGGTGGTGCCAAGGAAATGCCTTGGG 
CGTTTGACCGCTTGTATAAGGTCGAAATAACTGACTCATTGAAGACACTTTCTCTCGATG 
TCGATGGAGATTATGAAGTCACTTTTAAAATTCATGATATGCACGGAAACGCTCTTGATA
CGGACCTGATTCCACACGCAGCAGTTGTTTCTGAGCCAGCTCACC 

INTRON 1B/1C (SEQ ID NO: 115) 

GTAAGTAAATTTACAAAATTTGGTGTTCTCTAACTATCCTAAGTATTCAATCGTTAGCGT 
GTACCTATCTGCATAATGCAATACCCTGACTCCATATAAGTATAGTATATTTACTCTGGT
CGAAAACAAACAAATTGAAAACAAGAGTGGACGTGCTGTTATGATTTCTTTTTCATTCTT 
GGTTCGTTGTGTAATGCCACAGCCAGCAATTCCAGATATATAGCGACGGTCTATGAATAC 
TCCAGTCTGGACCAGACAATCGTGTGGAATGGTTTAGGCACATTATATCAAATTCATTGT 
TGAAGATATGAGTTATGAGGTCACAATGTTGTCTTGTTACCCCGTGTCAGTAGTGACGTC 
ATTTCATGACTGAAATCTCTTCAACGCCGTTTAGCAATAATAGGCTCAGTAGTATTCAAC
CAATTACAATCAGTAGAAAATTCTCTATACTATTCTTATGTTGCATCCTGATATCCCTAT
GCAAAAATTAGTCATCTAATATAATCATTTTCGATAAATACTTTGGGCAAACAAATCAAT
GTAACATCTATTTTCTTTCAG 

DOMAIN 1C 

CTACCTTTGAGGATGAAAAGCACAGCTTACGAATCAGAAAAAATGTCGACAGCTTGACTC
CTGAAGAAACAAATGAACTGCGTAAAGCCCTGGAGCTTCTTGAAAATGATCATACTGCAG 
GTGGATTCAATCAGCTTGGCGCCTTCCATGGAGAGCCTAAATGGTGCCCTAATCCTGAAG 
CGGAGCACAAGGTTGCATGCTGTGTTCATGGCATGGCTGTTTTCCCTCATTGGCACAGGC 
TTCTTGCTCTCCAGGCGGAGAATGCTCTTAGAAAGCATGGGTACAGTGGTGCTCTACCAT 
ACTGGGATTGGACTCGCCCCCTTTCCCAACTTCCTGATCTGGTTAGTCATGAGCAGTATA 
CAGATCCTTCCGACCATCACGTGAAGCATAACCCGTGGTTCAATGGCCACATCGATACAG
TAAATCAGGATACCACCAGAAGCGTACGGGAGGATCTTTATCAACAACCTGAATTTGGAC 
ATTTCACGGATATTGCTCAACAAGTCCTCTTAGCATTAGAACAAGATGACTTCTGTTCGT
TTGAAGTGCAGTATGAGATTTCCCATAATTTTATCCATGCACTTGTAGGAGGAACCGACG 
CTTATGGCATGGCATCGCTGAGATATACAGCATACGATCCAATCTTTTTCTTGCATCATT
CAAACACCGACAGGATCTGGGCTATTTGGCAATCCCTGCAAAAATACAGAGGCAAACCGT 
ACAACACTGCCAACTGCGCCATAGAATCTATGAGAAGGCCCCTGCAACCATTTGGACTAA 
GCAGTGCCATTAACCCTGACAGAATCACCAGAGAGCATGCTATCCCGTTTGATGTCTTCA
ACTATAGAGATAACCTTCATTACGTATATGATACCCTGGAATTTAATGGTTTGTCGATTT
CACAACTTGATAGAGAGCTGGAAAAAATCAAGAGTCACGAAAGAGTATTTGCTGGATTCT
TGCTGTCGGGGATTAAAAAATCTGCTCTTGTGAAATTCGAAGTTTGTACTCCACCTGATA 
ATTGTCATAAAGCAGGGGAGTTTTATCTACTCGGGGACGAAAACGAGATGGCTTGGGCCT 
ATGACCGACTTTTCAAGTATGATATTACTCAGGTTCTGGAAGCAAACCATCTACACTTCT
ATGATCATCTCTTCATTCGCTACGAAGTCTTTGATCTTAAAGGAGTGAGTTTGGGAACTG 
ACCTGTTCCACACTGCAAATGTGGTACATGATTCCGGCACAG 

INTRON 1C/1D (SEQ ID NO: 116) 

GTACGTGGATTTGATTACATAGCAATGCTATATGATTTCAGTAATTACAACCTCAAGTCA
TGTAGCCGTTTTAGATTGCATTACATCAAACAGCATTGGATTAAATTGGGGGATTGTCCA 
GGCCGCATTATGTTGCATTCCGAAAATAGTTTGTGTCCAGTGTCCACGTTTAAAATTAAA 
CCATTTTAATCATATTAGGGATAATTTTAATAGATGTTATAGTGCTTTATTTCATATTGT
TACAGTGGACAGTCACCAAGGACATATTTTACTCTATAGATACACAAACACCAATTAAAA
CCCTGCTTTGGAAAGTCTAACTTTTTCCCCACAG 

DOMAIN 1D

GCACCCGTGATCGTGATAACTACGTTGAAGAAGTTACTGGGGCCAGTCATATCAGGAAGA 
ATTTGAACGACCTCAATACCGGAGAAATGGAAAGCCTTAGAGCTGCTTTCCTGCATATTC 
AGGACGACGGAACATATGAATCTATTGCCCAGTACCATGGCAAACCAGGCAAATGTCAAT 
TGAATGATCATAATATTGCGTGTTGTGTCCATGGTATGCCTACCTTCCCCCAGTGGCACA 
GACTGTATGTGGTTCAGGTGGAGAATGCTCTCCTAAACAGGGGATCTGGTGTGGCTGTTC 
CTTACTGGGAGTGGACTGCTCCCATAGACCATCTACCTCATTTCATTGATGATGCAACAT 
ACTTCAATTCCCGACAACAGCGGTACGACCCTAACCCTTTCTTCAGGGGAAAGGTTACTT 
TTGAAAACGCAGTCACAACAAGGGACCCACAAGCCGGGCTCTTCAACTCAGATTATATGT 
ATGAGAATGTTTTACTTGCACTGGAGCAGGAAAATTATTGTGACTTTGAAATTCAGTTTG 
AGCTTGTTCATAACGCACTTCATTCCATGCTGGGAGGTAAAGGGCAGTACTCCATGTCCT 
CCCTGGACTATTCTGCGTTTGATCCCGTCTTCTTCCTACATCATGCCAACACGGACAGAC 
TGTGGGCAATCTGGCAGGAACTACAAAGATTCCGAGAACTGCCTTATGAAGAAGCGAACT 
GTGCAATCAACCTCATGCATCAACCACTGAAGCCGTTCAGTGATCCACATGAGAATCACG
ACAATGTCACTTTGAAATACTCAAAACCACAGGACGGATTCGACTACCAGAACCACTTCG
GATACAAGTATGACAACCTTGAGTTCCATCACTTATCTATCCCAAGTCTTGATGCTACCC 
TGAAGCAAAGGAGAAATCACGACAGAGTGTTTGCGGGCTTCCTTCTTCATAACATAGGAA 
CTTCTGCTGACATAACTATCTACATATGTCTGCCTGACGGACGGCGTGGCAATGACTGCA 
GTCATGAGGCGGGAACATTCTATATCCTCGGAGGCGAAACAGAGATGCCTTTTATCTTTG 
ACCGTTTGTATAAATTTGAAATCACCAAACCACTGCAACAGTTAGGAGTCAAGCTGCATG
GTGGAGTTTTCGAACTGGAGCTTGAGATCAAGGCATACAACGGTTCCTATCTGGATCCCC 
ATACCTTTGATCCAACTATCATCTTTGAACCTGGAACAG

INTRON 1D/1E (SEQ ID NO: 117)

GTAATGCCATCTTAATACAGTTCGTTCGTTAAATTATATATGTTCGTTTACAACACCATA
CCTTGAATTGAGGTAATACATCACTTGATATTGATAATGTAATGGTAATTGTTCTTGTTT 
GTAAAACCGTTTCTGGGGTGTTTATTCACTATCCACCTGGTGGATAGTGAGTAAACACAT 
TCGGTTTAATATGGGTATCTAATGGACAGTGAAGTGTGCTGGCTAGGCAGATACCTTGGT 
TTCTGTGAATGGAGGTAGTAGAAAGGGGTTTTGATGATTGCAG

DOMAIN 1E

ATACCCATATCTTGGACCACGACCATGAGGAAGAGATACTTGTCAGGAAGAATATAATTG
ATTTGAGCCCAAGGGAGAGGGTTTCTCTAGTCAAAGCTTTGCAAAGAATGAAGAATGATC 
GCTCCGCTGATGGGTACCAAGCCATTGCCTCTTTCCATGCCCTGCCACCACTCTGTCCCA 
ATCCATCTGCAGCTCACCGTTATGCTTGCTGTGTCCATGGCATGGCTACATTTCCCCAGT 
GGCACAGACTGTACACTGTTCAGGTTCAGGATGCCCTGAGGAGACATGGTTCACTTGTTG 
GTATTCCTTACTGGGACTGGACAAAACCAGTCAACGAGTTACCCGAGCTTCTTTCTTCAG 
CAACATTTTATCATCCAATCCGGAATATTAATATTTCAAATCCATTCCTCGGGGCTGACA 
TAGAATTTGAAGGACCGGGCGTTCATACAGAGAGGCACATAAATACTGAGCGCCTGTTTC 
ACAGTGGGGATCATGACGGATACCACAACTGGTTCTTCGAAACTGTTCTCTTTGCTTTGG 
AACAGGAAGATTACTGCGATTTTGAAATACAATTTGAGATAGCCCATAATGGCATCCACA
CATGGATTGGTGGAAGCGCAGTATATGGCATGGGACACCTTCACTATGCATCATATGATC 
CAATTTTCTACATCCACCATTCACAGACGGACAGAATATGGGCTATTTGGCAAGAGCTGC 
AGAAGTACAGGGGTCTATCTGGTTCGGAAGCAAACTGTGCCATTGAACATATGAGAACAC 
CCTTGAAGCCTTTCAGCTTTGGGCCACCCTACAATTTGAATAGTCATACGCAAGAATATT 
CAAAGCCTGAGGACACGTTTGACTATAAGAAGTTTGGATACAGATATGATAGTCTGGAAT
TGGAGGGGCGATCAATTTCTCGCATTGATGAACTTATCCAGCAGAGACAGGAGAAAGACA 
GAACTTTTGCAGGGTTCCTCCTTAAAGGTTTTGGTACATCCGCATCTGTGTCATTGCAAG 
TTTGCAGAGTTGATCACACCTGTAAAGATGCGGGCTATTTCACTATTCTGGGAGGATCAG
CCGAAATGCCATGGGCATTCGACAGGCTTTATAAGTATGACATTACTAAAACTCTTCACG 
ACATGAACCTGAGGCACGAGGACACTTTCTCTATAGACGTAACTATCACGTCTTACAATG
GAACAGTACTCTCGGGAGACCTCATTCAGACGCCCTCCATTATATTTGTACCTGGACGCC 

INTRON 1E/1F-1 (SEQ ID NO: 118) 

GTGAGTACCTGTTTGCACTAAGACTTCTGTAGGCTAAAAGTGTAAGAAATATCAATTAAT
TTCAATTCACCCAAACTTGAAAACGGTACCTATATAGGTTAACTTTTTGTCTACAGTAAA 
CTGAACATACCTACACATTTCATGAAATGATCTCTCAATATTTTCCACCAACAG

DOMAIN 1F-1 (1st part of domain f) 

ATAAACTCAACTCACGGAAACATACACCTAACAGAGTCCGCCATGAGCTAAGTAGCCTTA
GTTCCCGTGACATAGCAAGCTTGAAGGCAGCTTTGACAAGCCTTCAACATGATAATGGGA 
CTGATGGTTATCAAGCTATTGCTGCCTTCCATGGCGTTCCTGCGCAGTGCCACGAGCCAT 
CTGGACGTGAG 

INTRON 1F-1/1F-2 (SEQ ID NO: 119) 

GTAAATTTACAGAGCTTTATGAAGTGTGTTCAGAGTGAAGAGACCAAGATATACTTATAC
CCAAAACTAGCTAGCAACAGACGATTTCACTTGTTTCGGACACTTTGTATTATACGTTGG
ATCCCAAGGTAAACGGAAACGTAACCGAGAATCAGTCCGTAAAGTGAGTGAGTGAGTTTG 
GGGCTTAACGTCGCACTCAGCAATACCCCAGCTATGTGGCGACTCTCAGATTTACTGCTG 
GAGGAGAACCTACATAGCCCGGTTTAACCCGTGTGGTATGTAGTAAGACCAGCGCGGCAT 
GGCTGGTATCTGACGGACGAAGGGTGGCGCTGCACGTATTCCAGTGGTACAACACTGCAC 
CCCAATTTCACCGACCGGAGAACTGATCTCCCCTTCGGAGATATCGCCTGCCTTCCACGG 
GATTCGAACTCGGTGACCTTCAAGCCAGCGCGCTTCTAGCGGGGGCGATTAGAGGTTNAA 
GGCCGACGGCTCTACCACCTTAACTATCCCCCGGCCCCACTCCTGACGGAAATGTTTATA 
ATTCAGCCTTTGTTTTCTTATTAAACACTCTTGGCAGATTTTCTATAGATAATGGATTCA 
CATGTAGACAGTCTCCCATTGTTGTAACTGGTAGTCAAGAGTTAGAATCTGAATACATTC
TCCAAGATGGATCAAGGAAAACAATAATTACTTGATGTTGCAG

DOMAIN 1F-2 (2nd part of domain f)

ATCGCCTGTTGCATCCACGGCATGGCGACGTTTCCTCACTGGCACCGGTTGTACACTCTG 
CAGTTGGAGCAAGCGCTGCGCAGACACGGGTCCAGTGTTGCTGTTCCATACTGGGACTGG
ACCAAGCCAATCACCGAACTGCCACACATTCTGACAGACGGAGAATATTATGACGTTTGG
CAAAATGCCGTCTTGGCCAATCCGTTTGCAAGAGGTTATGTGAAAATTAAAGATGCATTT 
ACGGTGAGAAATGTCCAGGAAAGTCTGTTCAAAATGTCAAGTTTTGGAAAGCACTCGCTT 
CTGTTTGACCAGGCTTTGTTGGCTCTTGAACAAACTGACTACTGTGACTTCGAAGTTCAG 
TTTGAAGTGATGCATAACACGATCCATTATCTCGTAGGAGGGCGTCAAACGTACGCCTTC 
TCCTCTCTCGAGTATTCCTCATACGATCCAATCTTCTTTATTCACCACTCGTTTGTTGAC 
AAAATATGGGCTGTATGGCAAGAACTGCAAAGCAGGAGACATCTACAGTTTAGAACAGCT
GATTGTGCTGTGGGCCTCATGGGTCAGGCAATGAGGCCTTTCAACAAGGATTTCAACCAC 
AACTCGTTCACCAAGAAGCACGCAGTCCCTAATACAGTATTTGATTATGAAGATCTTGGC
TATAACTATGACAACCTTGAAATCAGTGGTTTAAACTTAAATGAGATCGAGGCGTTAATA
GCAAAACGCAAGTCACATGCTAGAGTCTTTGCTGGGTTCCTGTTGTTTGGATTAGGAACT 
TCGGCTGATATACATCTGGAAATTTGCAAGACATCGGAAAACTGCCATGATGCTGGTGTG 
ATTTTCATCCTTGGAGGTTCTGCAGAGATGCATTGGGCATACAACCGCCTCTACAAGTAT 
GACATTACAGAAGCATTGCAGGAATTTGACATCAACCCTGAAGATGTTTTCCATGCTGAT
GAACCATTTTTCCTGAGGCTGTCGGTTGTTGCTGTGAATGGAACTGTCATTCCATCGTCT 
CATCTTCACCAGCCAACGATAATCTATGAACCAGGCGAAG

INTRON 1F-2/1G-1 (SEQ ID NO: 120) 

GTGAGATATATGCAAATTGAATGTTGTCCAGATGCGTTGTTTACATTTATATGCTTGGAA 
TTGTCCTGAACGAATACAGTGGAATAACCAAAAGCTGAAAAATAAAAAGATATATACTTC
ATTCTGAATTTGTCAGTATTGCTGACCCAAAAACACGTTATCCATGTCGACACTATATTT 
GCCTTTCTGAATCTGAGACTGCGTTATGTTTCTAATAATCACGAAATATGGTATACAGGT
TGTGTATCTGTAGAATACCCAAGGCAGAATTTAAAGGGTCACACCCTGTTTAATACAG 

DOMAIN 1G-1 (1st part of domain g) 

ATCACCATGACGACCATCAGTCGGGAAGCATAGCAGGATCCGGGGTCCGCAAGGACGTGA 
ACACCTTGACTAAGGCTGAGACCGACAACCTGAGGGAGGCGCTGTGGGGTGTCATGGCAG 
ACCACGGTCCCAATGGCTTTCAAGCTATTGCTGCTTTCCATGGAAAACCAGCTTTGTGTC 
CCATGCCTGATGGCCACAACTACTCATGTTGTACTCACG

INTRON 1G-1/1G-2 (SEQ ID NO: 121) 

GTAAGTTTGTGTTGGTTAGTGTTGGTTGCATGTTTTGCCATATCGATAGTATCAGTGTGG 
TAACATCTGGTTTCTAGTTCATTCAGTTCACCTTATCAGAAGCTGTTTGCTCTCGTCTAC 
AATAGTGACGTCTTTCAGTTTTAGAACCGTGTACATCCGGGTTATATTGGTCTCCAGCAA 
CCCGTGCTTGTCGTGGGAGGCCACTGATGGGAACGGGTGGTCAGACTCGCTCACTTAGTT 
GACACATGTCAATTGCGAAGATCGATGCTGAGGTTGTTAAACATTGGATTGTCTGGTCCA 
GACTCGATTATTTACAGACAGCCGCCATGTACCTGGAATATTGCTGAGTGCGGCGTTAAA 
CAACAAACTAGTCAGACTAATCTTTCACTGTTTATAATGATGGCTCGAACCTAGCACTCA
TGTCCCAAGTTGGCGAACATCTGGAAGGGAATTTCAAATGAAAAGAACAATCTTTCACGT 
CTATTGGTATCACGCTCCTGGAGAAGAACATGATGTTCACGGCGTTACTTCCTCTTACCT 
GTTTTACTTGTTCCCACGTTTCTTCATATTTAAAGAGTATTTGGGTATTAGAGCTTTGGT 
GCTGTTACAATGCTACTCAACTGTTCAGTGCGGGCGACCGCGCTTGTTTACACATTAAGT 
TTTGTTTGTTGGTTGGTTTGTGTGTGTGTGTGTATGTGTGTGTGTGTGTGTGTGTGTGTA 
TGTGTGTGTGTGTGTATCTATGTCTATGTGTCTGTGTCTGTGTGTCTGTCTATGTGTGTG 
TGTGTCTGTGTCTATGTGTGTGTCTGCGTGTGTGTCTGTGTCCGTATGTGGCTGTGTCTA 
TGTGTGTGTGTGTCTGTGTTTATGTGTGTATATGCGTGTGTGTCTGTGTCCGTATGTGGC 
TGTGTCTATGTGTGTGACATGCAATACATGCTGTGATACTCACTAGCTGCGTCTATCGAC
CAG 

DOMAIN 1G-2 (2nd part of domain g) 

GCATGGCTACCTTCCCACACTGGCATCGCCTCTACACCAAGCAGATGGAGGATGCAATGA 
GGGCGCATGGGTCTCATGTCGGCCTGCCCTACTGGGACTGGACTGCTGCCTTCACCCACC 
TGCCAACACTGGTCACCGACACGGACAACAACCCCTTCCAACAT

INTRON 1G-2/1G-3 (SEQ ID NO: 122) 

GTAAGAGCGGGGTAGGGATGGGGTGGTAGGGGGTGGGTTGTTCTATTACTTCCCGCTTCA 
CTTGTATGAAATGGATAACCTTGGCTGCATCCCAATTGCGTGATCGATTCTCTTTCGATT 
CACTCGTGCGATTAGACTGCCTTATTTACTATAGTAGTTAGAATGTTGCTCAGTGCGCCG 
TTAAACAACTAATACACAAAACCGCATTTGTTTTATATGGTCACTCTACTGTTTATCACG
TATATGTATGTTCCGACTCACTGGTTGGTGCGTACCATTCTACTGTCACACTGAGAGCCA 
ATGTTCTCAGATGTGTGAAATGTTTGAAAGCCGTTTCTACATAATATTGCAGGAATACCA 
TTGTAGAATGTAGTCAAACAGGTAACAATCTGTTAGTGAGCCCAGTTCGAGGTTGCGTTG 
TAGGGTGTAGTCCAACAGGTAGGCAGTCCATAAGCATAGTTTTTAAGCATTTTAGATCAT 
CTATAATTAACCACATGGTTAGCCGCTATGTTTAGTTTAATCCAGTATAAGTTAGAACTG
TTATATTTCGAAGGGAAGTGAGTAAATCCTTATTCCTTGACTACCATTTAATAGATTTCC
CAATGACTCCATTCAACTCCTAACTTTCACATCACTGCTCTCTTCAACAG

DOMAIN 1G-3 (3rd part of domain g) 

GGACACATTGATTATCTCAATGTCAGCACAACTCGATCTCCCCGAGACATGCTGTTCAAC
GACCCCGAGCATGGATCAGAGTCGTTCTTCTACAGACAAGTCCTCTTAGCTCTGGAACAA 
ACTGATTTCTGCAAATTCGAAGTTCAGTTTGAGATAACCCACAATGCCATCCATTCCTGG 
ACAGGTGGCCACAGCCCCTACGGAATGTCCACTCTCGACTTCACTGCCTACGATCCTCTC 
TTCTGGCTTCACCACTCCAACACCGACAGAATCTGGGCTGTCTGGCAAGCTTTGCAAGAA 
TACAGAGGACTTCCATACAACCATGCCAATTGTGAGATCCAGGCAATGAAAACGCCCCTG 
AGGCCTTTCAGTGACGATATCAACCACAACCCAGTCACAAAGGCTAACGCGAAGCCATTA
GATGTGTTCGAGTATAATCGGTTGAGCTTCCAGTACGACAACCTCATCTTCCATGGATAC
AGTATTCCGGAACTTGATCGCGTGCTTGAAGAAAGAAAGGAGGAGGACAGAATATTTGCT 
GCCTTCCTTCTCAGTGGAATCAAGCGTAGTGCTGATGTAGTGTTCGACATATGCCAGCCA 
GAACACGAATGTGTGTTCGCAGGGACTTTTGCGATTTTGGGAGGGGAGCTAGAAATGCCC 
TGGTCCTTCGACAGACTGTTCCGCTATGATATCACCAAGGTGATGAAGCAGCTACACCTG 
AGGCATGACTCTGACTTTACCTTCAGGGTGAAGATTGTCGGCACCGACGACCACGAGCTT 
CCTTCAGACAGTGTCAAAGCACCAACTATTGAATTTGAACCGGGCG 

INTRON 1G-3/1H (SEQ ID NO: 123) 

GTGAGTACGACAGGCATTTCTAGTAAAAACCTACTTTTGGTAAAAGGTTCGAGAAATCAC 
TTGAAGCAACAACATGATTTTGTAACGCCTATTACACGTGAACATGTCACACCCGGTGAT
GCCGTTTAATGGACATGCCTCTGTTAATGAAAGGGGTAAGTACATGTGTATGGGGATGGG 
ATGGGAGCCACCTGTCCCAATTTCATAGGTCCCTAGGATCCCAGTTGCGTAGGAATCCCC 
TGATTAATGCCTTGTGAATTCCTCCTGGAATTGTCCTGGCCCAAATTTTTACAAACCCGC 
CCCGATATACCTTGGAAATAATTGGGCCTAAGGGTGGGGCTTTTAAGGACCAAGAACCCA 
ACCTAAACCCCAACCCATTTTTTCCCACCCATTCCAGGTTTTGTTTTACCAAATAAAAAG 
GTTTCCACTTTGAGGAAACCCTTTAAGGGTTCTTTTCAGGGCTTTTTTTCTTTTCTGGGA 
ATTCCAATTCCGGGGGAACAAAATACATATATTTCACAGACCTTTGGTCAAATTTATATA 
ATTTCCGACTTCATGTCATAGGTTTGTCTTTCTTCCTACACAG 

DOMAIN 1H

TGCACAGAGGCGGAAACCACGAAGATGAACACCATGATGACAGACTCGCAGATGTCCTGA
TCAGGAAAGAAGTTGACTTCCTCTCCCTGCAAGAGGCCAACGCAATTAAGGATGCACTGT 
ACAAGCTCCAGAATGACGACAGTAAAGGGGGCTTTGAGGCCATAGCTGGCTATCACGGGT 
ATCCTAATATGTGTCCAGAAAGAGGTACCGACAAGTATCCCTGCTGTGTCCACGGAATGC 
CCGTGTTCCCCCACTGGCACCGCCTGCATACCATTCAGATGGAGAGAGCTCTGAAAAACC 
ATGGCTCTCCAATGGGCATTCCTTACTGGGATTGGACAAAGAAGATGTCGAGTCTTCCAT 
CTTTCTTTGGAGATTCCAGCAACAACAACCCTTTCTACAAATATTACATCCGGGGCGTGC 
AGCACGAAACAACCAGGGACATTAATCAGAGACTCTTTAATCAAACCAAGTTTGGTGAAT 
TTGATTACCTATATTACCTAACTCTGCAAGTCCTGGAGGAAAACTCGTACTGTGACTTTG 
AAGTTCAGTATGAGATCCTCCATAACGCCGTCCACTCCTGGCTTGGAGGAACTGGAAAGT 
ATTCCATGTCTACCCTGGAGCATTCGGCCTTTGACCCTGTCTTCATGATTCACCACTCGA 
GTTTGGATAGAATCTGGATCCTTTGGCAGAAGTTGCAAAAGATAAGAATGAAGCCTTACT
ACGCATTGGATTGTGCTGGCGACAGACTTATGAAAGACCCCCTGCATCCCTTCAACTACG 
AAACCGTTAATGAAGATGAATTCACCCGCATCAACTCTTTCCCAAGCATACTGTTTGACC 
ACTACAGGTTCAACTATGAATACGATAACATGAGAATCAGGGGTCAGGACATACATGAAC 
TTGAAGAGGTAATTCAGGAATTAAGAAACAAAGATCGCATATTTGCTGGTTTTGTTTTGT 
CGGGCTTACGGATATCAGCTACAGTGAAAGTATTCATTCATTCGAAAAACGATACAAGTC 
ACGAAGAATATGCAGGAGAATTTGCAGTTTTGGGAGGTGAGAAGGAGATGCCGTGGGCAT 
ATGAAAGAATGCTGAAATTGGACATCTCCGATGCTGTACACAAGCTTCACGTGAAAGATG
AAGACATCCGTTTTAGAGTGGTTGTTACTGCCTACAACGGTGACGTTGTTACCACCAGGC 
TGTCTCAGCCATTCATCGTCCACCGTCCAGCCCATGTGGCTCACGACATCTTGGTAATCC 
CAGTAGGTGCGGGCCATGACCTTCCGCCTAAAGTCGTAGTAAAGAGCGGCACCAAAGTCG 
AGTTTACACCAATAGATTCGTCGGTGAACAAAGCAATGGTGGAGCTGGGCAGCTATACTG 
CTATGGCTAAATGCATCGTTCCCCCTTTCTCTTACCACGGCTTTGAACTGGACAAAGTCT 
ACAGCGTCGATCACGGAGACTACTACATTGCTGCAGGTACCCACGCGTTGTGTGAGCAGA 
ACCTCAGGCTCCACATCCACGTGGAACACGAGTAG

3'UTR 
TTCACAG 

INTRON 3'UTR (SEQ ID NO: 124) 

GTGAGGAGAAGGCCCCAGGCTAGCAGGGCAATGGATGAAGGAAATAGGGGCAAAGGGAAT 
AGCAGTTACACCATCGACATTTCCAACCTCCTCAGAAACTAATATATAGCCTTAATACAA 
CCAGCCAAGACTCAACGGGCAGCCGGGGTGGGGGGATTTGGTGGTCGCTGTTTCAGACCA 
GGGTGCAAAATATCAGTGCGCAAATCAACATGTTGCGTGTCAGACACTGACACAGCAGTC 
ATTGAACCTGCAGACCCATAACAGGAAAATGGGGCAGATACGATCAAAGACAGTGTAAAA 
TAGGGATAAGTAGGCATATGCAACCACCTGATGGAAATGAAAAGGGGTAAGTTTAAACCC 
CGGCTACCAAAGGTCCAATGGTTCCTTAACCCAGCTTACGCTATCCCTCTAATTTCAGTA 
TTGAGCTGATTTCTGTCGAGTTCATGTAAACTGTATACTTTCTGTATTATTACAG 

3'UTR 

GTTGCTATGCCGACTGCGCTATATTGGTGAACGAGACGATGAGGACATCTCTGAAAGAGT 
TCGCCAAGTGATGTGTAGGTCACGGAAGTATTGTTGAGCTAACAATATGATGATTTCAAA 
ATGACTTGGCGCTCTAGGACAAAGACATAATTCATCAGCACCCTGTGCACCAACTCTTTG 
TTTGCTGCAAACGTCTGACAAGCGACACGTCAATCAACAAGCTGTTCAAACTCAAGTGGA 
TGTAACTAGAATCGTTGGGCCATCGTTCACAAAGTATTGACAGATGTCACACATGATGGC
GAGAAACACTTTAGAACTTTTAATGACCTAGAGTGACTTGTAAATATGTAAATATATTCT
TCAAAGACTCAGCTGAACTATTGTTGGATAACACATCAATTCCCTCAACAAAATGCTTTA
TCTTCACATGGATGTATGTAATGTGGCCGGCAATAAAGTATATATATGTAT 

Figure 5

Primary structure of the HtH1 protein 

SIGNAL PEPTIDE 
LVQFLLVALWGAGA 
DOMAIN A 

DNWRKDVSHLTVDEVQALHGALHDVTASTGPLSFEDITSYHAAPASCDYKGRKIACCVHGMPSFP
FWHRAYWQAERALLSKRKTVGMPYWDWTQTLTHLPSLVTEPIYIDSKGGKAQTNYWYRGEIAFIN 
KKTARAVDDRLFEKVEPGHYTHLMETVLDALEQDEFCKFEIQFELAHNAIHYLVGGKFEYSMSNLE 
YTSYDPIFFLHHSNVDRLFAIWQRLQELRGKNPNAMDCAHELAHQQLQPFNRDSNPVQLTKDHSTP 
ADLFDYKQLGYSYDSLNLNGMTPEQLKTELDERHSKERAFASFRLSGFGGSANVWYACVPDDDPR
SDDYCEKAGDFFILGGQSEMPWRFYRPFFYDVTEAVHHLGVPLSGHYYVKTELFSVNGTALSPDLL 
PQPTVAYRPGK 

DOMAIN B 

GHLDPPVHHRHDDDLIVRKNIDHLTREEEYELRMALERFQADTSVDGYQATVEYHGLPARCPRPDA 
KVRFACCMHGMASFPHWHRLFVTQVEDALVRRGSPIGVPYWDWTKPMTHLPDLASNETYVDPYGHT
HHNPFFNANISFEEGHHHTSRMIDSKLFAPVAFGEHSHLFDGILYAFEQEDFCDFEIQFELVHNSI 
HAWIGGSEDYSMATLHYTAFDPIFYLHHSNVDRLWAIWQALQIRRHKPYQAHCAQSVEQLPMKPFA 
FPSPLNNNEKTHSHSVPTDIYDYEEVLHYSYDDLTFGGMNLEEIEEAIHLRQQHERVFAGFLLAGI 
GTSALVDIFINKPGNQPLKAGDIAILGGAKEMPWAFDRLYKVEITDSLKTLSLDVDGDYEVTFKIH 
DMHGNALDTDLIPHAAWSEPAH

DOMAIN C 

PTFEDEKHSLRIRKNVDSLTPEETNELRKALELLENDHTAGGFNQLGAFHGEPKWCPNPEAEHKVA
CCVHGMAVFPHWHRLLALQAENALRKHGYSGALPYWDWTRPLSQLPDLVSHEQYTDPSDHHVKHNP 
WFNGHIDTVNQDTTRSVREDLYQQPEFGHFTDIAQQVLLALEQDDFCSFEVQYEISHNFIHALVGG 
TDAYGMASLRYTAYDPIFFLHHSNTDRIWAIWQSLQKYRGKPYNTANCAIESMRRPLQPFGLSSAI 
NPDRITREHAIPFDVFNYRDNLHYVYDTLEFNGLSISQLDRELEKIKSHERVFAGFLLSGIKKSAL 
VKFEVCTPPDNCHKAGEFYLLGDENEMAWAYDRLFKYDITQVLEANHLHFYDHLFIRYEVFDLKGV 
SLGTDLFHTANWHDSGT 

DOMAIN D 

GTRDRDNYVEEVTGASHIRKNLNDLNTGEMESLRAAFLHIQDDGTYESIAQYHGKPGKCQLNDHNI
ACCVHGMPTFPQWHRLYWQVENALLNRGSGVAVPYWEWTAPIDHLPHFIDDATYFNSRQQRYDPN 
PFFRGKVTFENAVTTRDPQAGLFNSDYMYENVLLALEQENYCDFEIQFELVHNALHSMLGGKGQYS
MSSLDYSAFDPVFFLHHANTDRLWAIWQELQRFRELPYEEANCAINLMHQPLKPFSDPHENHDNVT 
LKYSKPQDGFDYQNHFGYKYDNLEFHHLSIPSLDATLKQRRNHDRVFAGFLLHNIGTSADITIYIC
LPDGRRGNDCSHEAGTFYILGGETEMPFIFDRLYKFEITKPLQQLGVKLHGGVFELELEIKAYNGS 
YLDPHTFDPTIIFEPGT

DOMAIN E 

DTHILDHDHEEEILVRKNIIDLSPRERVSLVKALQRMKNDRSADGYQAIASFHALPPLCPNPSAAH
RYACCVHGMATFPQWHRLYTVQVQDALRRHGSLVGIPYWDWTKPVNELPELLSSATFYHPIRNINI 
SNPFLGADIEFEGPGVHTERHINTERLFHSGDHDGYHNWFFETVLFALEQEDYCDFEIQFEIAHNG 
IHTWIGGSAVYGMGHLHYASYDPIFYIHHSQTDRIWAIWQELQKYRGLSGSEANCAIEHMRTPLKP
FSFGPPYNLNSHTQEYSKPEDTFDYKKFGYRYDSLELEGRSISRIDELIQQRQEKDRTFAGFLLKG 
FGTSASVSLQVCRVDHTCKDAGYFTILGGSAEMPWAFDRLYKYDITKTLHDMNLRHEDTFSIDVTI 
TSYNGTVLSGDLIQTPSIIFVPGR 

DOMAIN F 

HKLNSRKHTPNRVRHELSSLSSRDIASLKAALTSLQHDNGTDGYQAIAAFHGVPAQCHEPSGREIA 
CCIHGMATFPHWHRLYTLQLEQALRRHGSSVAVPYWDWTKPITELPHILTDGEYYDVWQNAVLANP 
FARGYVKIKDAFTVRNVQESLFKMSSFGKHSLLFDQALLALEQTDYCDFEVQFEVMHNTIHYLVGG
RQTYAFSSLEYSSYDPIFFIHHSFVDKIWAVWQELQSRRHLQFRTADCAVGLMGQAMRPFNKDFNH 
NSFTKKHAVPNTVFDYEDLGYNYDNLEISGLNLNEIEALIAKRKSHARVFAGFLLFGLGTSADIHL 
EICKTSENCHDAGVIFILGGSAEMHWAYNRLYKYDITEALQEFDINPEDVFHADEPFFLRLSWAV 
NGTVIPSSHLHQPTIIYEPGE

DOMAIN G 

DHHDDHQSGSIAGSGVRKDVNTLTKAETDNLREALWGVMADHGPNGFQAIAAFHGKPALCPMPDGH 
NYSCCTHGMATFPHWHRLYTKQMEDAMRAHGSHVGLPYWDWTAAFTHLPTLVTDTDNNPFQHGHID 
YLNVSTTRSPRDMLFNDPEHGSESFFYRQVLLALEQTDFCKFEVQFEITHNAIHSWTGGHSPYGMS 
TLDFTAYDPLFWLHHSNTDRIWAVWQALQEYRGLPYNHANCEIQAMKTPLRPFSDDINHNPVTKAN 
AKPLDVFEYNRLSFQYDNLIFHGYSIPELDRVLEERKEEDRIFAAFLLSGIKRSADVVFDICQPEH
ECVFAGTFAILGGELEMPWSFDRLFRYDITKVMKQLHLRHDSDFTFRVKIVGTDDHELPSDSVKAP 
TIEFEPG 

DOMAIN H 

VHRGGNHEDEHHDDRLADVLIRKEVDFLSLQEANAIKDALYKLQNDDSKGGFEAIAGYHGYPNMCP
ERGTDKYPCCVHGMPVFPHWHRLHTIQMERALKNHGSPMGIPYWDWTKKMSSLPSFFGDSSNNNPF 
YKYYIRGVQHETTRDINQRLFNQTKFGEFDYLYYLTLQVLEENSYCDFEVQYEILHNAVHSWLGGT 
GKYSMSTLEHSAFDPVFMIHHSSLDRIWILWQKLQKIRMKPYYALDCAGDRLMKDPLHPFNYETVN 
EDEFTRINSFPSILFDHYRFNYEYDNMRIRGQDIHELEEVIQELRNKDRIFAGFVLSGLRISATVK 
VFIHSKNDTSHEEYAGEFAVLGGEKEMPWAYERMLKLDISDAVHKLHVKDEDIRFRVVVTAYNGDV
VTTRLSQPFIVHRPAHVAHDILVIPVGAGHDLPPKVVVKSGTKVEFTPIDSSVNKAMVELGSYTAM
AKCIVPPFSYHGFELDKVYSVDHGDYYIAAGTHALCEQNLRLHIHVEHE 

Figure 6

Genomic sequence of the HtH2 gene 



DOMAIN 2A-1 (1st part of domain a) 
[domain a, parts 1-4: SEQ ID NO: 156] 

GGTCTTCCGTACTGGGACTGGACGCAGCATCTGACTCAACTCCCAGATCTGGTGTCAGACCCCTTG 
TTTGTCGACCCGGAAGGAGGAAAG 

INTRON 2A-1/2A-2 (SEQ ID NO: 125) 

GTAAGGGATCTCAGATCCGTCAGAGTGAGTGAGTGAGTGAGTGAGTGCCCAGCAACTGAAGCTAGG 
CCGCCCTACTGGGGATCACAGGGAATGTATGTCAATGGTTGAAGAAAGGAGCAGTGGGTTACAACG 
CCGCGTTCAAAGTCATGGCAGTTTCATAGCGCATTGTGCGCGCGTGTGTATCTGTGTGCGCGCGTG 
TGTGCTTGCGTGCGTGTGAGTGAGTCCGCTTGTGCATTTGTACTAGCACAGACTAATGCTGGTTCT 
AGAGAGCCTACTGATAAATGTTTACATTAAGATCTTTACAGTATACTGAGATTCGAGCCCAGACCA
GCGGAACACCAGGCAGGGTAACAACAAATAACGCCTTTCCACACAACCGACGCAGCCTAAAGTGGC 
TCTGATAGGCTGATACCGGTGTATTCTTAGAACTTGTAATTTGTGCTTTGCCATAATACATGTACT 
TCAGTTAACTGTAATACAGCATAAGACTGGACCGGTGTTTACGACGCAATGAGCAATAATTACTCT 
ACGAAAAGATTTGGTTAGACATATTCAATAATTGTAACATTCATTAACAATGAACACCACGTGCAC
TCTCGTTTGTGTCAACGTATTCATAATCATTCTCATGCATCTGTTAGCTCAGATATTTTGATGTTT 
CAAGAGATTTGTACGAACGTATGGGCTGGTGCCCCATGAAATTACATACAATGAATTCAGGTGAAA 
TACCTGGCGAGACAATAAGATCTTACTAGTGCTGCCACTTCAGTATGGTGTCCCCGATGGTGTCTG 
GTGTATGGGTGTGTTTGGCGTCAGTTGTTACTGGAAAAGTCAGCTCTAATTATGTCTTTATGTGGT 
TAAAGACCCCATAACCTAGATGTCTGGGTTTAACTTAACATGATAGTAACAGTCGGCTGTATAGCC 
TGACGCTTAAACGTTAGATGAATAAGGACTATATTGTGTTGTATAACATTTCTATAACCTCCTTTC
TATATCATTTAG 

DOMAIN 2A-2 (2nd part of domain a) 

GCCCATGACAACGCATGGTATCGTGGAAACATCAAGTTTGAGAATAAGAAGACTGCAAGAGCTGTT
GACGATCGCCTTTTCGAGAAGGTTGGACCAGGAGAGAATACCCGACTCTTTGAAGGAATTCTCGAT 
GCTCTTGAACAGGATGAATTCTGCAACTTCGAGATCCAGTTTGAGTTGGCTCACAACGCTATCCAC 
TACCTGGTTGGCGGCCGTCACAC 

INTRON 2A-2/2A-3 (SEQ ID NO: 126) 

GTGAGTCACGTTCTCTGATGGTCACGAGTCACGTTCTCTGATGGTCACGAGTCACGTTCTCTGATG 
GTCACGAGTCACGTTCTCTGATGGTCACGAGTCACATTCTCTGATGGTCACGAGTCACATTCTCTG 
TTGAGTGAAGTCTCAGTACCATTTATTTCTCTTACCTTCTTCTAACCAGGGGTTTCAGCGTGGATC 
GTCTGAGAAGTTAGCGCAAATCTATATTGAAGTCATTTTTCTATCATATAACCATCGTTATATCCA
CGTGCGAAAGTGTTCATTAATTATTTTTATTTTCATTTATGAAGGTCTAAAAGAAAATATGTATTG 
TTGGAAACTATATTCGAAGGTGAAGGCAACACGAGTGTATTAATATTCTCAATATCAATGTACGCT 
CTGTCAGCACCTGTTTCACCAGGAACTACACCTTTAGCGTACCAAAATATCAGCTGATGATTTCGA 
AGCGGACTATACCCTCACCACTTGTTTTGTGTGTGTATTTATGTGTGCATGTGTGTGCGTGCGTGC 
GTGTGTGTGTGTGTCCTACGTATGTTGATATTTTGTTCTGACTGTATATGTTCGTGCTTACCATTG 
AAG 

DOMAIN 2A-3 (3rd part of domain a)

GTACTCCATGTCTCATCTCGAGTACACCTCCTACGACCCCCTCTTCTTCCTCCATCACTCCAACAC
CGACCGCATCTTCGCCATCTGGCAACGTCTTCAGGTACTCAGAGGAAAGGACCCCAACACCGCCGA 
CTGCGCACACAACCTCATCCATGAGCCCATGGAACCGTTCCGTCGGGACTCGAACCCTCTTGACCT 
CACCAGGGAAAACTCCAAACCAATTGACAGCTTTGATTATGCCCACCTTGGCTACCA 

INTRON 2A-3/2A-4 (SEQ ID NO: 127) 

GTATGTATGATTCTAATAATGAATGTTTTTACCTCCGGTTTAAACAATATTTTAGTATTACGAAAG
GAGAAGTACCTCGAGAGGTCTAGGTCTCAGATGTTTAGAAACCCATGAAGACAGGTATGCTTCTGA
AAAACAAAGTAACATCATGAGGCTAAAGTTCAGATTCAAACCATCGTAGTTCGAATCCAGCATGCA 
AAGGGCCCTAACCCTGTAGATGGCGCTGCTTGAAACAGAGTAGTCTGTTCAGGGTCAGTACTGTCC 
CCACAAACATCATAGTCAGGGTCAGTACTGTCCCCACAAACATCATAGTCAGGGTCAGTACTGTCC 
CCACAAACATCACAGTCAGGGTTAATTTTGGATTCGGTTTCGAATGCGAAGAAGACAGTCACGCCC 
TGACACTGGACCGAGGTTGCCGAGAAAGCTCGTGATATTGCTGGAATACTGCCCAGTAAAACCATC 
ATTTATTTTAGGCTATTTATTACGAAAAATAATAATATGTATAGAAATGCATATGATCGCTGTTTG
AATGTAAAATTTAGAATGGGTTTGGGAGTGTTCACTATTTTTTCATCAAAATTTCATGTATTTTAA 
CCGATCGACGCTGAAGACAAACTACCGTTAATCAGGCAGTTCATTCATATCTGATAGGGAATATTG 
GTTGTTAACCAACGCTACATTGTGTCCAG 

DOMAIN 2A-4 (4th part of domain a) 

GTATGATGACTTGACCCTGAACGGTATGACCCCAGAGGAATTGAACTCATATCTGCATGAACGGTC
AGGCAAGGAGGGGGTGTTCGCAAGCTTCCGACTCTCAGGTTTTGGCGGCTCTGCTAACGTTGTTGT 
CTACGCATGCCGTCCTGCCCACGATGAAATGGCTGTCGATCAGTGCGACAAAGCCGGCGACTTCTT 
TGTGTTGGGCGGACCCACCGAGATGCCCTGGAGGTTTTACAGAGCATTCCACTTCGACGTCACCGA 
CAGCATCGACAACATCGACAAGGACCGCCACGGCCACTATTATGTAAAGGCGGAATTATTCAGTGT 
AAATGGAAGTGCGCTACCGAATGATCTCCTGCCTCAACCCACCATCTCACACAGGCCAGCCCGCGG 
ACACGTTGATG

INTRON 2A-4/2B (SEQ ID NO: 128) 

GTAAATGGCCATTGTATACATGCATTCATTTGGACTTTGAGTGAGTGAGTGGATGCGTATTCAGTA 
AGTGAGAGTGTGAGTGGGTATTAGGTCTGTGAGTGGGTTGGTGAGTGGATGGGTGAGTAAGAGTGG 
GTTGGTGAGAAAGTGAGTGAGTCACTTGGTGGGTGCGTTAGTGGAAGCGTGATTGAGTGGATGGGA 
GGTAGGTGAGTGAGTGAATTGGTGGGGGGGTGAGTGAGGTTAACGCTGTTCTGCTGTTCAATCACA 
CCACATGTTGCCAGCTTACTGTGCAGGACGAATCCAGGGTTGTGTTAAATTTTATATGTTTATATA 
TAACGATGGACGTGTCTGGATGTGGCGAATGTGTCAAGAGAATTATGCGGCTTTGTGCTGCTCCGC 
GTATTTATTGCACGCGCGTTGGTACGCGGTTGATAAAGTAGTTCAAAACATTTCCCAGCCATCTTT 
GTCTGTTGTGAAAACCTACTCCAGGACCATCCATTTCAATATGTGTCTGCGTTCATGGAGTTATAC 
ATGTTAAACTGTAGAGCGCAGATGAGCACACTTGAGCATTTCTTCAGTAAATCAGAATGTGTATAT
TTCAAAATTTACCAAATGCAATATCATCAAGCAAATTATGCAGCTCTATAGTAACATCGGAGTCAA 
TGGTCCAGTGTGCCCTCGGCTGCCATTCCGACCTCCCTGGCCAGAATACACCCCGGTCAGGATCAG 
TTATCCGTCAGAAGGCACGGTGCGGAATGAAAACATAAACACATAGTCGCTTAGTAGTATGCTGAT 
TTAGGCACGCAAAATCCGAATGTGAATTACTGTGAATTGCATTACCTGTTACAG

DOMAIN 2B 

AGGCCCCAGCTCCCTCCTCGGATGCTCACCTCGCCGTCAGGAAGGATATCAACCATCTGACACGCG 
AGGAGGTGTACGAGCTGCGCAGAGCTATGGAGAGATTCCAGGCCGACACATCCGTTGATGGGTACC 
AGGCTACGGTTGAGTATCACGGCTTACCTGCTCGATGTCCATTCCCCGAGGCCACAAATAGGTTCG 
CCTGTTGCATCCACGGCATGGCGACATTCCCTCATTGGCACAGACTGTTCGTTACCCAGGTGGAAG 
ATGCACTGATCAGGCGAGGATCCCCTATAGGGGTCCCCTACTGGGACTGGACTCAGCCTATGGCAC 
ATCTCCCAGGACTTGCAGACAACGCCACCTATAGAGATCCCATCAGCGGAGACAGCAGACACAACC
CGTTCCACGATGTTGAAGTTGCCTTTGAAAATGGGCGTACAGAACGTCACCCAGATAGTAGATTGT 
TTGAACAACCTCTATTTGGCAAACATACGCGTCTCTTCGACAGTATAGTCTATGCTTTTGAGCAGG
AGGACTTCTGCGATTTTGAAGTTCAATTTGAGATGACCCATAATAATATTCACGCCTGGATTGGTG 
GCGGCGGGAAGTATTCCATGTCTTCTCTACACTACACAGCCTTCGACCCTATCTCCTACCTTCATC 
ACTCCAACACTGACCGTCTCTGGGCAATTTGGCAAGCGTTGCAGATACGAAGAAACAAACCGTATA 
AGGCTCATTGTGCTTGGTCTGAGGAACGCCAGCCTCTCAAACCTTTCGCCTTCAGTTCCCCACTGA 
ACAACAACGAAAAAACCTACGAAAACTCGGTGCCCACCAACGTTTACGACTACGAAGGAGTCCTTG 
GCTATACTTATGATGACCTCAACTTCGGGGGCATGGACCTGGGTCAGCTTGAGGAATACATCCAGA 
GGCAGAGACAGAGAGACAGGACCTTTGCTGGCTTCTTTCTGTCACATATTGGTACATCAGCGAATG 
TTGAAATCATTATAGACCATGGGACTCTTCATACCTCCGTGGGCACGTTTGCTGTTCTTGGCGGAG 
AGAAGGAGATGAAATGGGGATTTGACCGTTTGTACAAATATGAGATTACAGATGAACTGAGGCAAC
TTAATCTCCGTGCTGATGATGGTTTCAGCATCTCTGTTAAAGTAACTGATGTTGATGGCAGTGAGC 
TGTCCTCTGAACTCATCCCATCTGCTGCTATCATCTTCGAACGAAGCCATA 

INTRON 2B/2C (SEQ ID NO: 129) 

GTAAGTAGCTACCTGTTTATTCAATTTTTTCGCTTTGCCAATCAATTCATTCAGCTTGAAATTCAA 
TAATTGTGTTTTGCATGGCTGAAAACCAATTTGAACTCTTTTCTTTTCTCAGGTCGAACTCAAATA 
AATAATCACTAATTGTTATGCACGCGGGTAGGGCATACATACTATATCCACATCGGTCATCTCAAA 
ATGCAAACAAATTGTCTTATTTCCGTTGGGACAAGCAAACCCCCTTTCCTGTAATCTTGCCTTTGG 
CATCCACTGGAATTAATGTTGACTGGTAATTGATACTGGCTCTCTTCTTGCATAGAGTTAATATCT 
ATAGTTTGTAAATCTTTATGATTTTGCTATTTATATTTCGACAGCATGCTATAGACACCCTAGACT
ATTGTATAGCCACTTGTATTGTTTTTCCATTTATTATTTATAACAGAACATGGCTTGTAATTTTTA
TTTACCTTCCAG 

DOMAIN 2C 

TTGACCATCAGGACCCTCATCAGGACACAATCATCAGGAAAAATGTTGATAATCTTACACCCGAGG
AAATTAATTCTCTGAGGAGGGCAATGGCAGACCTTCAATCAGACAAAACCGCCGGTGGATTCCAGC 
AAATTGCTGCTTTTCACGGGGAACCCAAATGGTGCCCAAGTCCCGATGCTGAGAAGAAGTTCTCCT 
GCTGTGTCCATGGAATGGCTGTCTTCCCTCACTGGCACAGACTCCTGACCGTGCAAGGCGAGAATG 
CCCTGAGAAAGCATGGATGTCTCGGAGCTCTCCCCTACTGGGACTGGACTCGGCCCCTGTCTCACC 
TACCTGATTTGGTAAGTCAGCAGAACTACACCGATGCCATATCCACCGTGGAAGCCCGAAACCCCT 
GGTACAGCGGCCATATTGATACAGTTGGTGTTGACACAACAAGAAGCGTCCGTCAAGAACTGTATG 
AAGCTCCCGGATTTGGTCATTATACTGGGGTCGCTAAGCAAGTGCTTCTGGCTTTGGAGCAGGATG 
ACTTCTGTGATTTTGAAGTCCAGTTTGAGATAGCTCACAATTTCATCCACGCTCTTGTCGGCGGAA 
GCGAGCCATATGGTATGGCGTCACTCCGTTACACTACTTATGATCCAATTTTCTACCTCCATCATT 
CTAACACTGACAGACTCTGGGCTATATGGCAGGCTCTACAAAAGTACAGGGGCAAACCTTACAATT 
CCGCCAACTGTGCCATTGCTTCTATGAGAAAACCCCTACAGCCCTTTGGTCTGACTGATGAGATCA 
ACCCGGATGATGAGACAAGACAGCATGCTGTTCCTTTCAGTGTCTTTGATTACAAGAACAACTTCA
ATTATGAATATGACACCCTTGACTTCAACGGACTATCAATCTCCCAGCTGGACCGTGAACTGTCAC 
GGAGAAAGTCTCATGACAGAGTATTTGCCGGATTTTTGCTGCATGGTATTCAGCAGTCTGCACTAG 
TTAAATTCTTTGTCTGCAAATCAGATGATGACTGTGACCACTATGCTGGTGAATTCTACATCCTTG 
GTGATGAAGCTGAAATGCCATGGGGCTATGATCGTCTTTACAAATATGAGATCACTGAGCAGCTCA
ATGCCCTGGATCTACACATCGGAGATAGATTCTTCATCAGATACGAAGCGTTTGATCTTCATGGTA
CAAGTCTTGGAAGCAACATCTTCCCCAAACCTTCTGTCATACATGACGAAGGGGCAG 

INTRON 2C/2D (SEQ ID NO: 130) 

GTGAGAACATTGATAATAGTTCAAATGAAGTATATCCGATTCAAGCTGTCGATACAAGATGAGATA
CATAATCACAATGTTTGTATTAGATATCTCTCTTAATTTAATGCCGCTTTTATCAATATTCGAGCA
ATCCTTCAGCAACATACACCAGCAAATGTTTCATCAACAGACTATATTATTTAATATTTTAAAAAT
CCTTCTCTGTTGTTATAAATACTTAAAGTATCGAATTCCTTGAATGCGTCTTCTCTGCAGCATATA 
GTTAAGTTGTTGTGTTTCTCTGTCAG 

DOMAIN 2D

GTCACCATCAGGCTGACGAGTACGACGAAGTTGTAACTGCTGCAAGCCACATCAGAAAGAATTTAA 
AAGATCTGTCAAAGGGAGAAGTAGAGAGCCTAAGGTCTGCCTTCCTGCAACTTCAGAACGACGGAG 
TCTATGAGAATATTGCCAAATTCCACGGCAAGCCTGGGTTGTGTGATGATAACGGTCGCAAGGTTG 
CCTGTTGTGTCCATGGAATGCCCACCTTCCCCCAGTGGCACAGACTCTATGTCCTCCAGGTGGAGA 
ATGCTTTGCTGGAGAGAGGATCTGCCGTCTCTGTGCCATACTGGGACTGGACTGAAACATTTACAG 
AGCTGCCATCTTTGATTGCTGAGGCTACCTATTTCAATTCCCGTCAACAAACGTTTGACCCTAATC 
CTTTCTTCAGAGGTAAAATCAGTTTTGAGAATGCTGTTACAACACGTGATCCCCAGCCTGAGCTGT 
ACGTTAACAGGTACTACTACCAAAACGTCATGTTGGCTTTTGAACAGGACAACTACTGCGACTTCG 
AGATACAGTTTGAGATGGTTCACAATGTTCTCCATGCTTGGCTTGGTGGAAGAGCTACTTATTCTA 
TTTCTTCTCTTGATTATTCTGCATTCGACCCTGTGTTTTTCCTTCACCATGCGAACACAGATAGAT 
TGTGGGCCATCTGGCAGGAGCTGCAGAGGTACAGGAAGAAGCCATACAATGAAGCGGATTGTGCCA 
TTAACCTAATGCGCAAACCTCTACATCCCTTCGACAACAGTGATCTCAATCATGATCCTGTAACCT 
TTAAATACTCAAAACCCACTGATGGCTTTGACTACCAGAACAACTTTGGATACAAGTATGACAACC
TTGAGTTCAATCATTTCAGTATTCCCAGGCTTGAAGAAATCATTCGTATTAGACAACGTCAAGATC 
GTGTGTTTGCAGGATTCCTCCTTCACAACATTGGGACATCCGCAACTGTTGAGATATTCGTCTGTG 
TCCCTACCACCAGCGGTGAGCAAAACTGTGAAAACAAAGCCGGAACATTTGCCGTACTCGGAGGAG 
AAACAGAGATGGCGTTTCATTTTGACAGACTCTACAGGTTTGACATCAGTGAAACACTGAGGGACC 
TCGGCATACAGCTGGACAGCCATGACTTTGACCTCAGCATCAAGATTCAAGGAGTAAATGGATCCT 
ACCTTGATCCACACATCCTGCCAGAGCCATCCTTGATTTTTGTGCCTGGTTCAA 

INTRON 2D/2E (SEQ ID NO: 131) 

GTAAGAAAGTTTCACTGTCTAAATCTTTTTTTATGATAGAGGGTAGAGAAGTGGAGACAATGTGAC
AATATATTGAATAAAGTTGTTTAAAATTTATAACTCTCATAAGTTCATATTATGCTGAAGCTGTAG
CCATCTATAACTGTGTAACATGAAATGTTAAGACATTAACCTAAATACTTCAGCTGATAACAAAAC
AATGTTAATACATACGTCAATGTAACATTTTCTTATCTTTAGGTTATAGCATAAACACTTCAGAGA
TACAGTGACGAAAACCTCTATTTAAATATTTCAG

DOMAIN 2E 

GTTCTTTCCTGCGTCCTGATGGGCATTCAGATGACATCCTTGTGAGAAAAGAAGTGAACAGCCTGA 
CAACCAGGGAGACTGCATCTCTGATCCATGCTCTGAAAAGTATGCAGGAAGACCATTCACCTGATG 
GGTTCCAAGCCATTGCCTCTTTCCATGCCCTGCCACCACTCTGCCCTTCACCATCTGCAACTCACC 
GTTATGCTTGCTGTGTCCACGGCATGGCTACATTTCCCCAGTGGCACAGACTGTACACTGTACAGT 
TCCAGGATGCACTGAGGAGACATGGAGCTGCAGTAGGTGTACCGTATTGGGATTGGCTGCGACCGC 
AGTCTCACCTACCAGAGCTTGTCACCATGGAGACATACCATGATATTTGGAGTAACAGAGATTTCC
CCAATCCTTTCTACCAAGCCAATATTGAGTTTGAAGGAGAAAACATTACAACAGAGAGAGAAGTCA
TTGCAGACAAACTTTTTGTCAAAGGTGGACACGTTTTTGATAACTGGTTCTTCAAACAAGCCATCC 
TAGCGCTTGAGCAGGAAAACTACTGTGACTTTGAGATTCAGTTTGAAATTCTTCACAACGGCGTTC 
ACACGTGGGTCGGAGGCAGTCGTACCCACTCTATCGGACATCTCCATTACGCATCCTACGACCCTC 
TTTTCTACCTCCACCATTCCCAGACAGACCGTATTTGGGCAATCTGGCAAGAACTCCAGGAACAGA
GAGGGCTCTCAGGTGATGAGGCTCACTGTGCTCTCGAGCAAATGAGAGAACCATTGAAGCCTTTCA 
GCTTCGGCGCTCCTTATAACTTGAATCAGCTAACACAGGATTTCTCCCGACCCGAGGACACCTTCG 
ACTACAGGAAGTTTGGTTATGAATATGACAATTTAGAATTCCTAGGAATGTCAGTTGCTGAACTGG 
ATCAATACATTATTGAACATCAAGAAAATGATAGAGTATTCGCTGGGTTCCTGTTGAGTGGATTCG
GAGGTTCCGCATCAGTTAATTTCCAGGTTTGTAGAGCTGATTCCACATGTCAGGATGCTGGGTACT 
TCACCGTTCTTGGTGGCAGTGCTGAGATGGCGTGGGCATTTGACAGGCTATACAAATATGACATTA 
CTGAAACTCTGGAGAAAATGCACCTTCGATATGATGATGACTTCACAATCTCTGTCAGTCTGACCG 
CCAACAACGGAACTGTCCTGAGCAGCAGTCTAATCCCAACACCGAGTGTCATATTCCAGCGGGGAC 
ATC 

INTRON 2E/2F-1 (SEQ ID NO: 132)

GTAAGTAGTAAACTGCTCAGATTGTTTTCATAATTACTCCACTATTAAGTAAAAAGTACTAGTAAT
TCAATAGTACTGTTCACAGAGAAATGTAACACAATAGACCACAGAGTCCATTTGTTAAACGCCTTT
GGCTTGGTAAGTCTGAGATTTTGGTGACTGATGGAAAGCTAAAATATATTTTGACAG 

DOMAIN 2F-1 (1st part of domain f) 

GTGACATAAATACCAAGAGCATGTCAGCGAACCGTGTTCGCCGTGAGCTGAGCGATCTGTCTGCGA 
GGGACCCGTCTAGTCTCAAGTCTGCTCTGCGAGACCTACAGGAGGATGATGGCCCCAACGGATACC 
AGGCTCTTGCAGCCTTCCATGGGCTACCAGCAGGCTGCCATGATAGCCAGGGAAATGAG 

INTRON 2F-1/2F-2 (SEQ ID NO: 133) 

GTATATTTAAGTATTTTATCTTACGCATGACCCTGACCCTATTTATTTTTTTTTAATCCTCGGATT 
TGTTTAATCCTGTTACCAGCGAAGGTCCGGGTTAGAATTGATCTTCAGTCAACTATTCTTGTCGTA 
GGACTAACGAGTTGTCTGGCTTGCTTACTCGGTTGACACGTGTCAACGGATCCCAATTGCAATTAG 
ATCGATGCTCATGCTGTTGATCCCTGGATTGCCTGGTCCGGACTCCACATACCGCCGCCATATTGC 
TGGTATATTGTCGAATGCGACGCTAAACAGCAAGCCAACCAACAATACTGAGACCTGGTGGTACAT 
GTCAGTTCTCTATTGCTGGGGTTCCAAACATAGCCATCAGTTGAAATATTTCATACATAGAAGAAT 
ACCTCTGAATATGATGATGAAACATTTACTTAGACTTGCCTGTGAGCCCCAGGCAAAATGCACTGT 
AAAAATACACTGACAGAGGATTAGGCATTCTTGGGAGTACTGTATAGTTAGTTGCATACATATTAG
CGTTCCCTCACTAAAACGAATCTCTGAATGCTATCAATTAAAGATCATGATGCTTTGATTGTGTCT 
ACTGTATTTAAAATGGTGTTAAGATTTGCAATTACAATATACACAAACACGTTTCCTGCATCTCGG 
AGAATGCAATCTTTCGTTGTACGCGTCTGTTTTCATATTTTTATGCATGTAGTTTGCACTACTTAG 
CGTCCAATAAATCCATTCACAAAATCACACAAACAAACGATTTTAGGAATGTGACTGTAGCTGCAA
CGAATATACCTGATCCTTTCTTGTTCCAG 

DOMAIN 2F-2 (2nd part of domain f) 

ATCGCATGTTGCATTCACGGTATGCCGACCTTCCCCCAGTGGCACAGACTGTACACCCTGCAGTTG 
GAGATGGCTCTGAGGAGACATGGATCATCTGTCGCCATCCCCTACTGGGACTGGACAAAGCCTATC 
TCCGAACTCCCCTCGCTCTTCACCAGCCCTGAGTATTATGACCCATGGCATGATGCTGTGGTAAAC 
AACCCATTCTCCAAAGGTTTTGTCAAATTTGCAAATACCTACACAGTAAGAGACCCACAGGAGATG
CTGTTCCAGCTTTGTGAACATGGAGAGTCAATCCTCTATGAGCAAACTCTTCTTGCTCTAGAGCAA 
ACCGACTACTGTGATTTTGAGGTACAGTTTGAGGTCCTCCATAACGTGATCCACTACCTTGTTGGC 
GGACGTCAGACCTACGCATTGTCTTCTCTGCATTATGCATCCTACGACCCATTCTTCTTTATACAC 
CATTCCTTTGTGGATAAGATGTGGGTAGTATGGCAAGCTCTTCAAAAGAGGAGGAAACTTCCATAC 
AAGCGAGCTGACTGTGCTGTCAACCTAATGACTAAACCAATGAGGCCATTTGACTCCGATATGAAT
CAGAACCCATTCACAAAGATGCACGCAGTTCCCAACACACTCTATGACTACGAGACACTGTACTAC
AGCTACGATAATCTCGAAATAGGTGGCAGGAATCTCGACCAGCTTCAGGCTGAAATTGACAGAAGC 
AGAAGCCACGATCGCGTTTTTGCTGGATTCTTGCTTCGTGGAATCGGAACTTCTGCTGATGTCAGG 
TTTTGGATTTGTAGAAATGAAAATGACTGCCACAGGGGTGGAATAATTTTCATCTTAGGTGGAGCC 
AAGGAAATGCCATGGTCATTTGACAGAAACTTCAAGTTTGATATCACCCATGTACTCGAGAAAGCT
GGCATTAGCCCAGAGGACGTGTTTGATGCTGAGGAGCCATTTTATATCAAGGTTGAGATCCATGCT 
GTTAACAAGACCATGATACCATCGTCTGTGATCCCAGCCCCAACTATCATCTATTCTCCTGGGGAA 
G 

INTRON 2F-2/2G-1 (SEQ ID NO: 134) 

GTGAGAGAACCAGTAATAGCTACTGTCTACAAAGAATGTGTTCATTTAAAGACCTGACTGTAGGCC
GATGGCTGCTGTCATCTCCTCCGCCTCCTCCTCCTGTTCCTCCTCCGAAGGGGTCAGCTTCAGGTT 
CTCTTGCCAATATGCCAAGCAGACCTCCTGAGCAGGCAGTATATATACGTAAGGGAAGCAAGTATG
GACCATCGCGCGGCATGTAGAGATACAATGATCAGCTGTCTGCTGTTCCACTCCTGTCAGACAATG 
AGATAAACATGAATACAGTATTACTCAGCAGCGTTCCAATTTTCAACCCTCGTATTTATTAAAAAA
AGGAATTTTTAATATATTTTTCTCCTTGTTGAAATATTTTAGTAACTGTTAATCGATATAGAGTGG 
AGTAGTGACGCTTTATTTCGGTTCATTCTCGAAACAAAAATATAATAGTCCACTGAACTCTCTTAA 
ATTGTTTTTACAACCTTCAACTGCCACAGACGTAATCCCTCACGTTATTTTGAGCTGACAACGTGT 
TGAATTGAGTGTGTTCCGAATTCTAAATAAGCATGTATATATTTACGTCTCATGCAAGTAATATAT 
GTTTAACTGATGACGTCACTTGGTGACCACTGATTTAGTTCCTTTGTCATAATTGCAGTTTCTGTT 
GTCACGGGGACGGTGGGGAAGCCAGGTTCCTCCTGTCACGCTGAATATCCCGTTCGAATCCCCCAC 
ATGGGTACAAAGTGTGATGCCTATTTCTGGTGTCCCCCACCGTGATATTGCTGGAATAAGTGGCTT 
AATACCATATACACTCACTCTATTGTCACACTACTGCCACCGGCTCACACCTCTGATGCTTCTGTT 
CTATCCAG 

DOMAIN 2G-1 (1st part of domain g) 

GTCGCGCTGCTGACAGTGCACACTCAGCCAACATTGCTGGCTCTGGGGTGAGGAAGGACGTCACGA 
CCCTCACTGTGTCTGAGACCGAGAACCTAAGACAGGCTCTTCAAGGTGTCATCGATGATACTGGTC 
CCAATGGTTACCAAGCAATAGCATCCTTCCACGGAAGTCCTCCAATGTGCGAGATGAACGGCCGCA 
AGGTTGCCTGTTGTGCTCACG 

INTRON 2G-1/2G-2 (SEQ ID NO: 135) 

GTAATTAATGGATGTGAAGTCAATGTCCGAGGGTATAATAAGGATTTAAATACTTCAGTCGTGTAA 
TACTGTATGACATGTGTATTGGATGGTGTAGGTATTACAGGTTATAAGGCCAGTGTGTGTTGGGAC 
GGTTACTTTCCTGCACTAGTAATAAGCATTGTATTTAGCTAGCTTTTATCATATAACTTTAGTTTC 
ATGGTTTGTGGCAATTGAAATCGAAATTTTCTTTCATTTCAAGGTTATCGCACTCGTGTGTTAGAA 
TAGTTACTATGCTGCATTGAGAATAACACTATAGTAATAAAGCATATCATACAGTAAGAATAACAC
TATAGTAATAAAGTATATCATACAGTAAGAATGTCATTGTATGATAAATAGGTTATCACACTCGTG
TGTTTTAGAATGGTTACTATCCCAGGAATAACCACTATGTATTACATGTATATTGGGCAGTGTAAG 
TAGTAGCATTGTATATTAAATCAGTATATCGTGCTTCAAAACACCAGGATATATGGGGTATACAGT 
GGGCAGTGTAAGTAGCAACATTGTATATTAAATCAGTATATCGTACTTCAAAACACCAGGATTATG 
GGGTATACAGTGGGCAGTGTAAGTAGTAGCATTGTATATTAAATCAGTATATCGTACTTCAAAACA 
CCAGGATATAATTCAGTATATCGTGCTTCAAAACACCAGGATATAATTCAGTATATCGTGCTTCAA 
AACACCAGGATATATGGGATATACAGTGCGGGTTTGCATACAACCTCCACCCTTTACAG 

DOMAIN 2G-2 (2nd part of domain g) 

GTATGGCCTCCTTCCCACACTGGCACAGACTGTATGTGAAGCAGATGGAAGACGCCCTGGCTGACC 
ACGGATCACATATCGGCATCCCTTACTGGGACTGGACAACTGCCTTCACAGAGTTACCCGCCCTTG 
TCACAGACTCCGAGAACAATCCCTTCCATGAG

INTRON 2G-2/2G-3 (SEQ ID NO: 136) 

GTCAGTTTAGTCTCCTGTCTGAGCTAACGATACCAATTTCCTATTTTCGAGAACCACGATGACGAG 
AAAACAAGCAATATAGATATAGATGCAGTATAGATCAAGTTAATGAATTCATTGCTATATGTTTGC
TTGTAATAAACTTTAAGAAAACGAGAGCATGCACACAAATGAAACAAACAATTATGTGTTTGATAG
GAATATGATATATGTATTTGGGGGCTGACGTGAGCAGGGTTGAAGGGACAGTTTACATTGTCAGTA 
ACACTGGGAGTATTCTTTGATCCACAATATATAGTTTCATTGTGTTCAGCAGTTACAACTAACATT
ATATCATACATTACGTCGTAACATGCTTCTTTTGTCCTCTTCTGCCAG 

DOMAIN 2G-3 (3rd part of domain g)

GGTCGCATTGATCATCTCGGTGTAACCACGTCACGTTCCCCCAGAGACATGCTGTTTAACGACCCA 
GAGCAAGGATCAGAGTCGTTCTTCTATAGACAAGTCCTCCTGGCTTTGGAGCAGACTGACTACTGC 
CAGTTCGAAGTCCAGTTTGAGCTGACCCACAACGCCATTCACTCCTGGACAGGTGGACGTAGCCCT
TACGGAATGTCGACCCTCGAGTTCACAGCCTACGATCCTCTCTTCTGGCTTCACCACTCCAACACC 
GACAGAATCTGGGCTGTCTGGCAAGCACTGCAGAAATACCGAGGACTCCCATACAACGAAGCACAC 
TGTGAAATCCAGGTTCTGAAACAGCCCTTGAGGCCATTCAACGATGACATCAACCACAATCCAATC 
ACCAAGACTAATGCCAGGCCTATCGATTCATTTGATTATGAGAGGTTTAACTATCAGTATGACACC 
CTTAGCTTCCATGGTAAGAGCATCCCTGAACTGAATGACCTGCTCGAGGAAAGAAAAAGAGAAGAG
AGAACATTTGCTGCCTTCCTTCTTCGTGGAATCGGTTGCAGTGCTGATGTCGTCTTTGACATCTGC 
CGCCCCAATGGTGACTGTGTCTTTGCAGGAACCTTTGCTGTGCTGGGAGGGGAGCTAGAAATGCCT 
TGGTCCTTCGACAGACTGTTCCGCTATGACATCACCAGAGTCATGAATCAGCTCCATCTCCAGTAT 
GATTCAGATTTCAGTTTCAGGGTGAAGCTTGTTGCAACCAATGGCACTGAGCTTTCATCAGACCTC 
CTCAAGTCACCAACAATTGAACATGAACTTGGAG

INTRON 2G-3/2H (SEQ ID NO: 137) 

GTATGTTATCTTATTATCAAATGTGTAATCAGATACTGGAGACGTTTTCATATTAACTTGGTCAGC
ATTAGTTGATGATTTTGGTGCGATATTGACGACAAGGAGTTAAGCATTAACACGTTCAACACATCT
TTAATCTGATATGAGAAGGGAATAAATTGATCCAGTATTGATGATTGAAGTTAGATTAACAGTGAA
AGATATACCAGTTTTGATAATCGTATAAAACAGTAGCAGAATTGTATCGTGAAAACTAAATGTGGG 
AAGGCGAACGCCAAGCAGATTTTAGATTACGATCGTGTGCTAGAATAATTCACAATAACCCAGACG
TCGGAAATGTGGTTGTCTATGGCAATAGTTACGATTAATTGCTAACATGCACGATTTACCTATTTC 
AG 

DOMAIN 2H 

CCCACAGAGGACCAGTTGAAGAAACAGAAGTCACTCACCAAAATACTGACGGCAATGCACACTTCC
ATCGTAAGGAAGTTGATTCGCTGTCCCTGGATGAAGCAAACAACTTGAAGAATGCCCTTTACAAGC 
TACAGAACGACCACAGTCTAACAGGATACGAAGCAATCTCTGGTTACCATGGATACCCGAATCTGT
GTCCGGAAGAAGGCGATGACAAATACCCCTGCTGCGTCCACGGAATGGCCATCTTCCCCCACTGGC 
ACAGACTCTTGACCATCCAACTGGAAAGAGCTCTCGAGCACAATGGTGCACTGCTTGGTGTTCCTT 
ACTGGGACTGGACCAAGGACCTGTCGTCACTGCCGGCGTTCTTCTCCGACTCCAGCAACAACAATC 
CCTACTTCAAGTACCACATCGCAGGTGTTGGTCACGACACCGTCAGAGAGCCAACTAGTCTTATAT 
ATAACCAGCCCCAAATCCATGGTTATGATTATCTCTATTACCTAGCATTGACCACGCTTGAAGAAA
ACAATTACTGTGACTTTGAGGTTCAGTATGAGATCCTCCACAACGCCGTCCACTCCTGGCTTGGAG 
GATCCCAGAAGTATTCCATGTCTACCCTGGAGTATTCGGCCTTTGACCCTGTCTTTATGATCCTTC 
ACTCGGGTCTAGACAGACTTTGGATCATCTGGCAAGAACTTCAGAAGATCAGGAGAAAGCCCTACA
ACTTCGCTAAATGTGCTTATCATATGATGGAAGAGCCACTGGCGCCCTTCAGCTATCCATCTATCA 
ACCAGGACGAGTTCACCCGTGCCAACTCCAAGCCTTCTACAGTTTTTGACAGCCATAAGTTCGGCT 
ACCATTACGATAACCTGAATGTTAGAGGTCACAGCATCCAAGAACTCAACACAATCATCAATGACT
TGAGAAACACAGACAGAATCTACGCAGGATTTGTTTTGTCAGGCATCGGTACGTCTGCTAGTGTCA 
AGATCTATCTCCGAACAGATGACAATGACGAAGAAGTTGGAACTTTCACTGTCCTGGGAGGAGAGA
GGGAAATGCCATGGGCCTACGAGCGAGTTTTCAAGTATGACATCACAGAGGTTGCAGATAGACTTA 
AACTAAGTTATGGGGACACCTTTAACTTCCGACTAGAGATCACATCCTACGATGGATCGGTGGTAA 
ACAAGAGCCTACCCAATCCTTTCATCATCTACAGACCTGCCAATCATGACTACGATGTTCTTGTTA
TCCCAGTAGGAAGAAACCTTCACATCCCTCCCAAAGTTGTCGTCAAGAGAGGCACCCGCATCGAGT 
TCCACCCAGTCGATGATTCAGTTACGAGACCAGTTGTTGATCTTGGAAGCTACACTGCACTCTTCA 
ACTGTGTGGTACCACCGTTCACATACCGCGGATTCGAACTGAACCACGTCTATTCTGTCAAGCCTG 
GTGACTACTATGTTACCGGACCAACGAGAGACCTTTGCCAGAATGCAGATGTCAGGATTCATATCC
ATGTTGAGGATGAGTAA

3'UTR



CGCAACAG 

INTRON 3'UTR (SEQ ID NO: 138)

GTGAGATAAGAAACCCTTCTAACAGTAATACGACACCACATTACAGCTTAAACATGATTGCCATCG
ATGTTTTCATGTGTAGTATACGCTTTTCAGTTCTACATAATTTTGTTTTTCAAATCAAGTTTAGCA 
AATGAATCTATCACTGGAAAATAGGGTAGGGTAGCCAAGTGGTTAAAGCGGTCACTGATCACGCCA 
AAGACGAGTGTCCTAACCTGCATGGGTACAAAAGTGAAGACCATTGCTGGTGTCTACCGCCGTAAT 
ATTGTTTTTAGTATTGCTAAAACTTATACTCACCCATGCGCTGTAAAAGTGGAATAATAATCATAT
TTCAACAAAAGCACAAAACCATTTCATTTTCATGAAAGCCTCTTGTTCACCTGAAAGACGCAAGAG
AACAATAGTTCCTAACATTATTTTCAGACATTGGAAATGTCCTGCACGTGTAAACCATATATCCTT 
TGAAATTTTTACGACTGCATCGTATACAATTTATGATATAAATTTAAAACTTTATTTCAG

3'UTR 

GTTTCTTGGTCTCCACATATTCACACATCAGCACCAAACGGTTTCGAAGGACATTGGCGTTCTTCT 
CTGGCAATGCATTTCAATACAACATTGAAAATGACTTCAGCATATCAGTGTGCTTCGAACGTGTTC
CGGAAGTACTCAAATGTGCTATGACTGAATTATTGTACATACATAACTTATTGATGTTCAATAAAT
AAATGTTGAAACG 

Figure 7

Primary structure of the HtH2 protein 



DOMAIN A (SEQ ID NO: 156) 

GLPYWDWTQHLTQLPDLVSDPLFVDPEGGKAHDNAWYRGNIKFENKKTARAVDDRLFEKVGPGENT 
RLFEGILDALEQDEFCNFEIQFELAHNAIHYLVGGRHTYSMSHLEYTSYDPLFFLHHSNTDRIFAI
WQRLQVLRGKDPNTADCAHNLIHEPMEPFRRDSNPLDLTRENSKPIDSFDYAHLGYQYDDLTLNGM
TPEELNSYLHERSGKEGVFASFRLSGFGGSANVVVYACRPAHDEMAVDQCDKAGDFFVLGGPTEMP
WRFYRAFHFDVTDSIDNIDKDRHGHYYVKAELFSVNGSALPNDLLPQPTISHRPARGHVDEAPAPS
SDAHLAVRKDINHLTREEVYELRRAMERFQADTSVDGYQATVEYHGLPARCPFPEATNRFACCIHG 
MATFPHW 

DOMAIN B 

HRLFVTQVEDALIRRGSPIGVPYWDWTQPMAHLPGLADNATYRDPISGDSRHNPFHDVEVAFENGR 
TERHPDSRLFEQPLFGKHTRLFDSIVYAFEQEDFCDFEVQFEMTHNNIHAWIGGGGKYSMSSLHYT
AFDPISYLHHSNTDRLWAIWQALQIRRNKPYKAHCAWSEERQPLKPFAFSSPLNNNEKTYENSVPT 
NVYDYEGVLGYTYDDLNFGGMDLGQLEEYIQRQRQRDRTFAGFFLSHIGTSANVEIIIDHGTLHTS
VGTFAVLGGEKEMKWGFDRLYKYEITDELRQLNLRADDGFSISVKVTDVDGSELSSELIPSAAIIF
ERSH 

DOMAIN C 

IDHQDPHQDTIIRKNVDNLTPEEINSLRRAMADLQSDKTAGGFQQIAAFHGEPKWCPSPDAEKKFS
CCVHGMAVFPHWHRLLTVQGENALRKHGCLGALPYWDWTRPLSHLPDLVSQQNYTDAISTVEARNP 
WYSGHIDTVGVDTTRSVRQELYEAPGFGHYTGVAKQVLLALEQDDFCDFEVQFEIAHNFIHALVGG 
SEPYGMASLRYTTYDPIFYLHHSNTDRLWAIWQALQKYRGKPYNSANCAIASMRKPLQPFGLTDEI 
NPDDETRQHAVPFSVFDYKNNFNYEYDTLDFNGLSISQLDRELSRRKSHDRVFAGFLLHGIQQSAL 
VKFFVCKSDDDCDHYAGEFYILGDEAEMPWGYDRLYKYEITEQLNALDLHIGDRFFIRYEAFDLHG 
TSLGSNIFPKPSVIHDEGA

DOMAIN D 

GHHQADEYDEVVTAASHIRKNLKDLSKGEVESLRSAFLQLQNDGVYENIAKFHGKPGLCDDNGRKV
ACCVHGMPTFPQWHRLYVLQVENALLERGSAVSVPYWDWTETFTELPSLIAEATYFNSRQQTFDPN 
PFFRGKISFENAVTTRDPQPELYVNRYYYQNVMLAFEQDNYCDFEIQFEMVHNVLHAWLGGRATYS 
ISSLDYSAFDPVFFLHHANTDRLWAIWQELQRYRKKPYNEADCAINLMRKPLHPFDNSDLNHDPVT 
FKYSKPTDGFDYQNNFGYKYDNLEFNHFSIPRLEEIIRIRQRQDRVFAGFLLHNIGTSATVEIFVC
VPTTSGEQNCENKAGTFAVLGGETEMAFHFDRLYRFDISETLRDLGIQLDSHDFDLSIKIQGVNGS 
YLDPHILPEPSLIFVPGSS 

DOMAIN E 

SFLRPDGHSDDILVRKEVNSLTTRETASLIHALKSMQEDHSPDGFQAIASFHALPPLCPSPSATHR
YACCVHGMATFPQWHRLYTVQFQDALRRHGAAVGVPYWDWLRPQSHLPELVTMETYHDIWSNRDFP 
NPFYQANIEFEGENITTEREVIADKLFVKGGHVFDNWFFKQAILALEQENYCDFEIQFEILHNGVH 
TWVGGSRTHSIGHLHYASYDPLFYLHHSQTDRIWAIWQELQEQRGLSGDEAHCALEQMREPLKPFS 
FGAPYNLNQLTQDFSRPEDTFDYRKFGYEYDNLEFLGMSVAELDQYIIEHQENDRVFAGFLLSGFG 
GSASVNFQVCRADSTCQDAGYFTVLGGSAEMAWAFDRLYKYDITETLEKMHLRYDDDFTISVSLTA 
NNGTVLSSSLIPTPSVIFQRGH 

DOMAIN F

RDINTKSMSANRVRRELSDLSARDPSSLKSALRDLQEDDGPNGYQALAAFHGLPAGCHDSQGNEIA 
CCIHGMPTFPQWHRLYTLQLEMALRRHGSSVAIPYWDWTKPISELPSLFTSPEYYDPWHDAVVNNP
FSKGFVKFANTYTVRDPQEMLFQLCEHGESILYEQTLLALEQTDYCDFEVQFEVLHNVIHYLVGGR 
QTYALSSLHYASYDPFFFIHHSFVDKMWVVWQALQKRRKLPYKRADCAVNLMTKPMRPFDSDMNQN
PFTKMHAVPNTLYDYETLYYSYDNLEIGGRNLDQLQAEIDRSRSHDRVFAGFLLRGIGTSADVRFW 
ICRNENDCHRGGIIFILGGAKEMPWSFDRNFKFDITHVLEKAGISPEDVFDAEEPFYIKVEIHAVN
KTMIPSSVIPAPTIIYSPGE

DOMAIN G 

GRAADSAHSANIAGSGVRKDVTTLTVSETENLRQALQGVIDDTGPNGYQAIASFHGSPPMCEMNGR 
KVACCAHGMASFPHWHRLYVKQMEDALADHGSHIGIPYWDWTTAFTELPALVTDSENNPFHEGRID
HLGVTTSRSPRDMLFNDPEQGSESFFYRQVLLALEQTDYCQFEVQFELTHNAIHSWTGGRSPYGMS 
TLEFTAYDPLFWLHHSNTDRIWAVWQALQKYRGLPYNEAHCEIQVLKQPLRPFNDDINHNPITKTN 
ARPIDSFDYERFNYQYDTLSFHGKSIPELNDLLEERKREERTFAAFLLRGIGCSADWFDICRPNG
DCVFAGTFAVLGGELEMPWSFDRLFRYDITRVMNQLHLQYDSDFSFRVKLVATNGTELSSDLLKSP 
TIEHEL 

DOMAIN H 

GAHRGPVEETEVTHQNTDGNAHFHRKEVDSLSLDEANNLKNALYKLQNDHSLTGYEAISGYHGYPN 
LCPEEGDDKYPCCVHGMAIFPHWHRLLTIQLERALEHNGALLGVPYWDWTKDLSSLPAFFSDSSNN 
NPYFKYHIAGVGHDTVREPTSLIYNQPQIHGYDYLYYLALTTLEENNYCDFEVQYEILHNAVHSWL 
GGSQKYSMSTLEYSAFDPVFMILHSGLDRLWIIWQELQKIRRKPYNFAKCAYHMMEEPLAPFSYPS
INQDEFTRANSKPSTVFDSHKFGYHYDNLNVRGHSIQELNTIINDLRNTDRIYAGFVLSGIGTSAS
VKIYLRTDDNDEEVGTFTVLGGEREMPWAYERVFKYDITEVADRLKLSYGDTFNFRLEITSYDGSV 
VNKSLPNPFIIYRPANHDYDVLVIPVGRNLHIPPKVWKRGTRIEFHPVDDSVTRPWDLGSYTAL 
FNCWPPFTYRGFELNHVYSVKPGDYYVTGPTRDLCQNADVRIHIHVEDE
Figure 8

Genomic sequence of the KLH1 gene 



DOMAIN 1B

GGCCTACCGTACTGGGACTGGACTGAACCCATGACACACATTCCGGGTCTGGCAGGAAACAAAACT 
TATGTGGATTCTCATGGTGCATCCCACACAAATCCTTTTCATAGTTCAGTGATTGCATTTGAAGAA 
AATGCTCCCCACACCAAAAGACAAATAGATCAAAGACTCTTTAAACCCGCTACCTTTGGACACCAC 
ACAGACCTGTTCAACCAGATTTTGTATGCCTTTGAACAAGAAGATTACTGTGACTTTGAAGTCCAA
TTTGAGATTACCCATAACACGATTCACGCTTGGACAGGAGGAAGCGAACATTTCTCAATGTCGTCC 
CTACATTACACAGCTTTCGATCCTTTGTTTTACTTTCACCATTCTAACGTTGATCGTCTTTGGGCC 
GTTTGGCAAGCCTTACAGATGAGACGGCATAAACCCTACAGGGCCCACTGCGCCATATCTCTGGAA 
CATATGCATCTGAAACCATTCGCCTTTTCATCTCCCCTTAACAATAACGAAAAGACTCATGCCAAT
GCCATGCCAAACAAGATCTACGACTATGAAAATGTCCTCCATTACACATACGAAGATTTAACATTT
GGAGGCATCTCTCTGGAAAACATAGAAAAGATGATCCACGAAAACCAGCAAGAAGACAGAATATAT
GCCGGTTTTCTCCTGGCTGGCATACGTACTTCAGCAAATGTTGATATCTTCATTAAAACTACCGAT 
TCCGTGCAACATAAGGCTGGAACATTTGCAGTGCTCGGTGGAAGCAAGGAAATGAAGTGGGGATTT 
GATCGCGTTTTCAAGTTTGACATCACGCACGTTTTGAAAGATCTCGATCTCACTGCTGATGGCGAT 
TTCGAAGTTACTGTTGACATCACTGAAGTCGATGGAACTAAACTTGCATCCAGTCTTATTCCACAT
GCTTCTGTCATTCGTGAGCATGCACGTGGTAAGCTGAATAGAG 

INTRON 1B/1C (SEQ ID NO: 139)

GTTTTGTAATAATTATGTAGAATTCTTTACCTCAGAATAAGATGAGGTCACATGGGTTTTGCAAAA 
CTATTACGTTCGAATTAATATTAATAATACCGGACCCTCCACTGGTACATATTTATCTTTATAACG
ATAATAGCGATGATGATGATGATGATGATGATGATGATGATGATGATAATGATGATGCCGGTATTG
CACGTAATCCAGCCGACTTAGATGACACCCTAAGGGTGCAGAAAGTATAACAATTAGATTGCGTTT
GCATCTGTGTATGCGTGTGCTTTAACCAAAAGTCAAAATAAAAGTGCAAACCCTTAGTTTATTCAT 
TTGATAGAGCCTTTTACGATAAGAACAATGTAATAAATTAGAACATAACTGAAACCTCCGAAAGAA
GGCCTGTTTGTCAAGAGAGGTATCGACATGATTGACTTATAAACCTGTGCTTCTATATTTTGGAAC 
TGTCCACTTTCTTGTTGTGTGTACTGTAATCACATCGCACTATGGCTGCAAGACGTGTACGAGTAC 
ACTATATACTTACCTAATGACCAACCACAAGGCTGGCTTTGTTAATATTGTTATTTCACAGAAATA
AACACAGAATTCCAGCATTTGGCTGGTGTATTTAGCAAAACACCGATATGACACTCATGTTTTATT 
ACATTTTTTTCAG 

DOMAIN 1C 

TTAAATTTGACAAAGTGCCAAGGAGTCGTCTTATTCGAAAAAATGTAGACCGTTTGAGCCCCGAGG 
AGATGAATGAACTTCGTAAAGCCCTAGCCTTACTGAAAGAGGACAAAAGTGCCGGTGGATTTCAGC 
AGCTTGGTGCATTCCATGGGGAGCCAAAATGGTGTCCTAGTCCCGAAGCATCTAAAAAATTTGCCT 
GCTGTGTTCACGGCATGTCTGTGTTCCCTCACTGGCATCGACTGTTGACGGTTCAGAGTGAAAATG 
CTTTGAGACGACATGGCTACGATGGAGCTTTGCCGTACTGGGATTGGACCTCTCCTCTTAATCACC 
TTCCCGAACTGGCAGATCATGAGAAGTACGTCGACCCTGAAGATGGGGTAGAGAAGCATAACCCTT 
GGTTCGATGGTCATATAGATACAGTCGACAAAACAACAACAAGAAGTGTTCAGAATAAACTCTTCG
AACAGCCTGAGTTTGGTCATTATACAAGCATTGCCAAACAAGTACTGCTAGCGTTGGAACAGGACA 
ATTTCTGTGACTTTGAAATCCAATATGAGATTGCCCATAACTACATCCATGCACTTGTAGGAGGCG
CTCAGCCTTATGGTATGGCATCGCTTCGCTACACTGCTTTTGATCCACTATTCTACTTGCATCACT 
CTAATACAGATCGTATATGGGCAATATGGCAGGCTTTACAGAAGTACAGAGGAAAACCGTACAACG 
TTGCTAACTGTGCTGTTACATCGATGAGAGAACCTTTGCAACCATTTGGCCTCTCTGCCAATATCA 
ACACAGACCATGTAACCAAGGAGCATTCAGTGCCATTCAACGTTTTTGATTACAAGACCAATTTCA
ATTATGAATATGACACTTTGGAATTTAACGGTCTCTCAATCTCTCAGTTGAATAAAAAGCTCGAAG 
CGATAAAGAGCCAAGACAGGTTCTTTGCAGGCTTCCTGTTATCTGGTTTCAAGAAATCATCTCTTG
TTAAATTCAATATTTGCACCGATAGCAGCAACTGTCACCCCGCTGGAGAGTTTTACCTTCTGGGTG 
ATGAAAACGAGATGCCATGGGCATACGATAGAGTCTTCAAATATGACATAACCGAAAAACTCCACG
ATCTAAAGCTGCATGCAGAAGACCACTTCTACATTGACTATGAAGTATTTGACCTTAAACCAGCAA
GCCTGGGAAAAGATTTGTTCAAGCAGCCTTCAGTCATTCATGAACCAAGAATAG

INTRON 1C/1D (SEQ ID NO: 140) 

GTACTTGTTATATGTTTCGAATATTGCCGATACCTTCAATATATATACTTTATCAAAGTAATTGAT
TAATCTGAAGTAATTTTCCTTTCCAGTAGAGATTCAGTTGATACAACAAGAATTCGCCCTGTTGTA
TGTCACTTTATTTTCATCAAACGATTCGAAGTGAGCTGTCCATGCCACAATGGGGTCTCTGTAACT 
TTCTCGTATGGGGTATAGATTATATAGACGTGGCAGACCTTACGTATAACTAATATTTGTGTAATG 
TCGTTTCAG 

DOMAIN 1D

GTCACCATGAAGGCGAAGTATATCAAGCTGAAGTAACTTCTGCCAACCGTATTCGAAAAAACATTG 
AAAATCTGAGCCTTGGTGAACTCGAAAGTCTGAGAGCTGCCTTCCTGGAAATTGAAAACGATGGAA 
CTTACGAATCAATAGCTAAATTCCATGGTAGCCCTGGTTTGTGCCAGTTAAATGGTAACCCCATCT 
CTTGTTGTGTCCATGGCATGCCAACTTTCCCTCACTGGCACAGACTGTACGTGGTTGTCGTTGAGA 
ATGCCCTCCTGAAAAAAGGATCATCTGTAGCTGTTCCCTATTGGGACTGGACAAAACGAATCGAAC 
ATTTACCTCACCTGATTTCAGACGCCACTTACTACAATTCCAGGCAACATCACTATGAGACAAACC
CATTCCATCATGGCAAAATCACACACGAGAATGAAATCACTACTAGGGATCCCAAGGACAGCCTCT
TCCATTCAGACTACTTTTACGAGCAGGTCCTTTACGCCTTGGAGCAGGATAACTTCTGTGATTTCG 
AGATTCAGTTGGAGATATTACACAATGCATTGCATTCTTTACTTGGTGGCAAAGGTAAATATTCCA 
TGTCAAACCTTGATTACGCTGCTTTTGATCCTGTGTTCTTCCTTCATCACGCAACGACTGACAGAA 
TCTGGGCAATCTGGCAAGACCTTCAGAGGTTCCGAAAACGGCCATACCGAGAAGCGAATTGCGCTA 
TCCAATTGATGCACACGCCACTCCAGCCGTTTGATAAGAGCGACAACAATGACGAGGCAACGAAAA
CGCATGCCACTCCACATGATGGTTTTGAATATCAAAACAGCTTTGGTTATGCTTACGATAATCTGG 
AACTGAATCACTACTCGATTCCTCAGCTTGATCACATGCTGCAAGAAAGAAAAAGGCATGACAGAG
TATTCGCTGGCTTCCTCCTTCACAATATTGGAACATCTGCCGATGGCCATGTATTTGTATGTCTCC 
CAACTGGGGAACACACGAAGGACTGCAGTCATGAGGCTGGTATGTTCTCCATCTTAGGCGGTCAAA 
CGGAGATGTCCTTTGTATTTGACAGACTTTACAAACTTGACATAACTAAAGCCTTGAAAAAGAACG
GTGTGCACCTGCAAGGGGATTTCGATCTGGAAATTGAGATTACGGCTGTGAATGGATCTCATCTAG 
ACAGTCATGTCATCCACTCTCCCACTATACTGTTTGAGGCCGGAACAG

INTRON 1D/1E (SEQ ID NO: 141)

GTAACTATTTTGTCACTGTAACCAACAACTGCAGTCTATTTTGCAATTACGATAATAACAATTTTT
GAAATATATCTTTATTAAAGCAAAGGTTTCTAGAGACAAACAGCCGGCTCTAATTATTTTTTCGAA
CTTACGCTTGAGTAAAGATCTGCAAATGGCAACCCTACCTATACTATTAAAAATATAATGTTACAT
TCGTATCTGAATGTTTAATAAATCACTTCATATTCTGTTGCAG 

DOMAIN 1E

ATTCTGCCCACACAGATGATGGACACACTGAACCAGTGATGATTCGCAAAGATATCACACAATTGG
ACAAGCGTCAACAACTGTCACTGGTGAAAGCCCTCGAGTCCATGAAAGCCGACCATTCATCTGATG 
GGTTCCAGGCAATCGCTTCCTTCCATGCTCTTCCTCCTCTTTGTCCATCACCAGCTGCTTCAAAGA 
GGTTTGCGTGCTGCGTCCATGGCATGGCAACGTTCCCACAATGGCACCGTCTGTACACAGTCCAAT
TCCAAGATTCTCTCAGAAAACATGGTGCAGTCGTTGGACTTCCGTACTGGGACTGGACCCTACCTC 
GTTCTGAATTACCAGAGCTCCTGACCGTCTCAACTATTCATGACCCGGAGACAGGCAGAGATATAC
CAAATCCATTTATTGGTTCTAAAATAGAGTTTGAAGGAGAAAACGTACATACTAAAAGAGATATCA
ATAGGGATCGTCTCTTCCAGGGATCAACAAAAACACATCATAACTGGTTTATTGAGCAAGCACTGC 
TTGCTCTTGAACAAACCAACTACTGCGACTTCGAGGTTCAGTTTGAAATTATGCATAATGGTGTTC 
ATACCTGGGTTGGAGGCAAGGAGCCCTATGGAATTGGCCATCTGCATTATGCTTCCTATGATCCAC
TTTTCTACATCCATCACTCCCAAACTGATCGTATTTGGGCTATATGGCAATCGTTGCAGCGTTTCA 
GAGGACTTTCTGGATCTGAGGCTAACTGTGCTGTAAATCTCATGAAAACTCCTCTGAAGCCTTTCA 
GCTTTGGAGCACCATATAATCTTAATGATCACACGCATGATTTCTCAAAGCCTGAAGATACATTCG
ACTACCAAAAGTTTGGATACATATATGACACTCTGGAATTTGCAGGGTGGTCAATTCGTGGCATTG 
ACCATATTGTCCGTAACAGGCAGGAACATTCAAGGGTCTTTGCCGGATTCTTGCTTGAAGGATTTG 
GCACCTCTGCCACTGTCGATTTCCAGGTCTGTCGCACAGCGGGAGACTGTGAAGATGCAGGGTACT 
TCACCGTGTTGGGAGGTGAAAAAGAAATGCCTTGGGCCTTTGATCGGCTTTACAAGTACGACATAA 
CAGAAACCTTAGACAAGATGAACCTTCGACATGACGAAATCTTCCAGATTGAAGTAACCATTACAT 
CCTACGATGGAACTGTACTCGATAGTGGCCTTATTCCCACACCGTCAATCATCTATGATCCTGCTC 
ATC 

INTRON 1E/1F (SEQ ID NO: 142)

GTAAGTATACACACATTATTTCTCTTCTGCTATATCAGATGAAGAGAACGTTGTATCACTAACCTA
GTCTTGTTTGATTTGTGGTTTCGTTTGCTTCCTGAACAGTAGGGTTGATTTAACTTCTCTGTTTCG 
TCTGTACCAATGAAAGACTATGATGCTTGTGTGAAGATGCTTTGTTCATGAGTCAGTCTGTTCTTG 
TAATGCTTTGATCTTTGCCATCAACATTCTTGAAATTAATTATGGTTTCCCTTAAATACTTACATA 
TTACATTTAAACGTCGCTGCTTGTCTGATTGCATATTCTTTCAAAAATAACTATATATTCCAG 

DOMAIN 1F-1 (1st part of domain f) 

ATGATATTAGTTCGCACCACCTGTCGCTCAACAAGGTTCGTCATGATCTGAGTACACTGAGTGAGC 
GAGATATTGGAAGCCTTAAATATGCTTTGAGCAGCTTGCAGGCAGATACCTCAGCAGATGGTTTTG 
CTGCCATTGCATCCTTCCATGGTCTGCCTGCCAAATGTAATGACAGCCACAATAACGAG 

INTRON 1F-1/1F-2 (SEQ ID NO: 143) 

GTAAATATACAGTGAAATCCGGATAAGTAAAATCCAGATAAGAAAAAAAACATTTTCTGTGGTCCC 
GGCATGTTTCTTCTTCATCTATCATTATTTTGATACGGATAAGTAAAAATCGGCTGAGTAAAACAT
CCGGGTAAGTAAAATGATTTTCGAGGTCTCTTCATCGGATAAGTAAGATACACAAGTGATCATTCC
AATAAACACTAACTGATGCAACACAATACCAGCGCACAGTGTTTTCACTACGTTTGTTTGTATTGT
AATTAACAATTAACACTTAAGTGTTTCCCAATGTGTCCGTGTGCAAACTGATTGGGACAAAGCTTG 
CAACAAGCCCGGCAATTCCATGTCGTTTATGTCTACGTTTGTTATTCTGACTGCTTGGAGGGGTTC 
GGAAAAAAATAAAAAACGGGTAAATATTATAAAAAATTCACGGTGCCTTGAAATTTTAGGTGTCCG
GATTTCACTGTAGATGATTAATTTCTCACTTGTAAACAAAAGGACCCCAGTACCCTCATTCGTGAC
GTACGTTATAAAATGTAATTATAAAAAGCCCATTATCATGTTATACGTGATCTTGNCTTGCAATTA
TNCTACCGCTTTCTTGATTTTTTAAAGCAATTTCTCCCTCTATGAACTTATTAACATAGCACTCCT 
GCAAAAGAAAACAGTCACTGCATGGATCCATATTGAATGTTGCTGCTTATTTCTCATTTTATTACT 
CACAGATATTTCAAGAACATCGTACTCTCTAACCAGGCTAAAGCAAAGAGGGTTACATTTTAGCCG 
ACAAGTTCACTAGCTGAGTGGAACACGTATATATTAATGGAGATGACTCTGGTCATGATGATTAGG
ACAATTATCATGACGTTATCATTGATCATGACCATGTCAGTATAATAGATAGCTAACAAATAATGT
AATTACTAATTATGAAGCAATGGTGCATTTGCAG

DOMAIN 1F-2 (2nd part of domain f) 

GTGGCATGCTGTATCCATGGAATGCCTACATTCCCCCACTGGCACAGACTCTACACCCTCCAATTT 
GAGCAAGCTCTAAGAAGACATGGCTCTAGTGTAGCAGTACCCTACTGGGACTGGACAAAGCCAATA 
CATAATATTCCACATCTGTTCACAGACAAAGAATACTACGATGTCTGGAGAAATAAAGTAATGCCA
AATCCATTTGCCCGAGGGTATGTCCCCTCACACGATACATACACGGTAAGAGACGTCCAAGAAGGC 
CTGTTCCACCTGACATCAACGGGTGAACACTCAGCGCTTCTGAATCAAGCTCTTTTGGCGCTGGAA 
CAGCACGACTACTGCGATTTTGCAGTCCAGTTTGAAGTCATGCACAACACAATCCATTACCTAGTG 
GGAGGACCTCAAGTCTATTCTTTGTCATCCCTTCATTATGCTTCATATGATCCGATCTTCTTCATA 
CACCACTCCTTTGTAGACAAGGTTTGGGCTGTCTGGCAGGCTCTTCAAGAAAAGAGAGGCCTTCCA 
TCAGACCGTGCTGACTGCGCTGTTAGTCTGATGACTCAGAACATGAGGCCTTTCCATTACGAAATT
AACCATAACCAGTTCACCAAGAAACATGCAGTTCCAAATGATGTTTTCAAGTACGAACTCCTGGGT 
TACAGATACGACAATCTGGAAATCGGTGGCATGAATTTGCATGAAATTGAAAAGGAAATCAAAGAC
AAACAGCACCATGTGAGAGTGTTTGCAGGGTTCCTCCTTCACGGAATTAGAACCTCAGCTGATGTC 
CAATTCCAGATTTGTAAAACATCAGAAGATTGTCACCATGGAGGCCAAATCTTCGTTCTTGGGGGG 
ACTAAAGAGATGGCCTGGGCTTATAACCGTTTATTCAAGTACGATATTACCCATGCTCTTCATGAC
GCACACATCACTCCAGAAGACGTATTCCATCCCTCTGAACCATTCTTCATCAAGGTGTCAGTGACA 
GCCGTCAACGGAACAGTTCTTCCGGCTTCAATCCTGCATGCACCAACCATTATCTATGAACCTGGT 
CTCGGTG 

INTRON 1F-2/1G-1 (SEQ ID NO: 144) 

GTCTCGGTGAGTTATTAAAAGAAACAAAATATTTACCATTACCATTGTTAACTACAAAAATGAGTG
AGATATCTTATATCACTGGTACACTACTGATATTTTATGCAATGAAATTACTATTTTTCCAGGTAC
GCTTCAACCCCTCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCATCATGCTTTTCTGT 
AAAACATAAAACACCAATTAACAATGTTCTTAGTGTGTTTGTTGACTCCCTTCCACTGCAACGCCT 
ACATAATCAAAGTGTTCGTTTTTTTCCAAACTTTCCAGTTAGTGTTGAAGACTAAAAAGTTAAATA 
AGCATTCACATAACTTCTAAGAGCAACTGGGACCATGCAGTTACGTATTGATATTTCTGTGAGAGT 
GAAGCAAAACACTGTTTTTCAAGCTTAGGTTTATCAATCAAAATGTCCAATAGTTCATGTTATCGA
AAAGGCAGCGAAGGATAAGAGGCTCCGAGACATCTTGTCTATTCTCGTGTTCATATGATATCAACT 
GAGGAGCTTCCATTACATTTTTGACCTTATCATTTAAAGACATACATGGAACATTTTCATTTTACA
GTTAAAGTGAACCACTTCAGGTTCAACTTCAACTTCGAATTCAACTTCTGTTGTGTGTTTTATGAG 
CCGACTGAAATAGAGTGCCTTACTTTCACTTCTAGTTTCGTTCTGTCTCGTCATCGTTGTTTCTTT 
CAGTGTGCATAGTACACGCCTAGTATAGAACACACGAACTTGTCCTTACTTAATAGATTCTGAAAC
TATTATGTGGAAAGTTGGCAGGCTATAGTAACATCCTGGCAAAATTATCATGTATCCTCTTGTTTG 
TCATAATTAG

DOMAIN 1G-1 (1st part of domain g) 

ACCATCACGAAGATCATCATTCTTCTTCTATGGCTGGACATGGTGTCAGAAAGGAAATCAACACAC 
TTACCACTGCAGAGGTGGACAATCTCAAAGATGCCATGAGAGCCGTCATGGCAGACCACGGTCCAA 
ATGGATACCAGGCTATAGCAGCGTTCCATGGAAACCCACCAATGTGCCCTATGCCAGATGGAAAGA 
ATTACTCGTGTTGTACACATG 

INTRON 1G-1/1G-2 (SEQ ID NO: 145)

GTATGTATTTCCCACTGGTGGTCGCTGACTGCCAACACATACTTGTAATTTATTCATGAAAGTATA 
ATAGTTTGTTTGAAAGTATATTTATAACCATCTTGCACAAGCGTCACGAATTTTCACCACAAAGCT
TCAAAACGCCCAAAACATTCTAATAGCGATATATTTGTTAAAAGACCAAAATATAGCCTTACAACA
ATAGATTATTTTAATAAGACCAGTCAGTGCATGCAAATCGATTGGAAACTTTGAAATAAAATATTC
TATGTACTAACTGCCAATCTCATAATACTTGCCTTGGATGTGCTTCTTTTTCACATTCGCGTCGAG 
CTTCAACTCCAATGCATAAGCTTAAAAATAATCATAAACACAAACAAATAGCCACAGAGGCGACGA
TCCCTCCAGGCCAGGCTTTATTTGTCTCTTATAGAATATATCGCTATTAGAATGTTTTTGACGTTT 
TGAAGCTTTGTGGGTGAAAATTCGTGATGTTTATGCGTGGTATTTATGTAAGATGAAAATAAATAT 
ATCTTTTCAAACAAGATTTTAGTATTTTGAAGACTTCTATGAATAAATTACACTTATGTGTTAGGT
TATTGGTCACTGAGCGCTTGTGGTATTTTCCCTTCTTCAATTTGTTTGTTCTTTGTTCAATTTCGA 
ATAGTTATCCTACTGTGGATAGTCTATATGAGAATCGTTGAAAGAATAATACAATTCTAATGGATT
GCAACTTCTTTAACTTTTATTTGCAACTGCCACGTTTCGGTATACGTTCTTATGCCGTCATCAAGC 
ATACGAGTGTACATGTATGCCAAAACGCTGCAAATAAAAATTAAAGAAGTTGCAATCCATAAGAAT
TTCAATGTTCTTTCATCATCACATCAACTTCTAAAAATGCCTATAAAACAATCAACAAACGTACAA
TAGTACATTACCGGATCTCGCAGCATGACCACGTCGATATCTAAACAATATCACTATCCATTAATA 
GGATCAAGAGTAGGTACAGACATGTTCAGTTATAAATACTCTTCAAAAAAGTAGGGGAACTTGGAA
TTTCAAGGTCAATAACAAACTAATGATAATAACAATTGGTCCCAAATAATAACAATTGGTCCCAAA
CTAATTGTATCTTTACAAAGAAGAAATTGAGTGAACAATTCACCCGGTATTTTATTACCTAAACCG
TTTCTCTTGCTGTTATGGTGCGTGAAAGAAGAAATGGGTAAGAAACGGAAATTGACATTTTTGCGT
CAGTGGTGCGTAATGCCCCCATTGTTGGCCAAACACTGATTGATTCGCTGAGGCATCGTGCATACG 
CGTCTACCTATGGTAATTTGATGCAGTCTGTCCCATTCTTCCACCAACGCCTGGACAAGTTCATCT 
AGCGTGGCTGGTGGCCTTTCACGTTGACGCACACGTCGGCCCAAGATGTCCCAGACATTTTCAATG 
GCCAGGGCTCATTGCTGGTCAGGGCATCCTATGGATATTGTGCCGTTGAAGGTGGTTATGTTGTTC 
ACATTGAAATTCCAAGTTCTCCTACTCTTTTTAAGAGGAGGTTCACAAAGTACGTTCTTTCATGTT 
GGTGAAGAGAATATCAAGGTCTTCTAAGGGATTGTGTCTTATAATATTTGATTTTAAGAAGTTTGA
TATTATCTGCATCCTTCCCAAGAAATTGCAAATGTTCACACACTATTGCGTTTGATAATGTTTTTG 
GGGAAATAAACTGTCCAGGACTGCTAAATAGTAATTATTGCTACTTTTAG

DOMAIN 1G-2 (2nd part of domain g) 

GCATGGCTACTTTCCCCCACTGGCACAGACTGTACACAAAACAGATGGAAGATGCCTTGACCGCCC 
ATGGTGCCAGAGTCGGCCTTCCTTACTGGGACGGGACAACTGCCTTTACAGCTTTGCCAACTTTTG 
TCACAGATGAAGAGGACAATCCTTTCCATCAT 

INTRON 1G-2/1G-3 (SEQ ID NO: 146) 

GTGAGTTCACGTAAGCCTACGAGATCAACATTACTCCTTAACAGCCACGGCATCATGTACCGATAT 
ATCACAAACAAAAGTATTCAAAGCTTTAAACACGATATGTATGGTTCAAGAATGACATCATTAAAC 
AAGGACATGAGTCTGAAATAAACATGACTTGACACCGTTGTGGTCACAGTTTTGTTTCTCATTGGT 
GAACCTGTGAAACAACCTTTCAAACCAAAAGATGCCTATTAATATTGTTAATTCCCATGAATTAGG

AGATACACACATTCTACTGTCATTT AATAACCGCTTC 

CAGCATGAAAACACAATATGATTATCTCAATTCTACCATTACTAATTATAATTTTGACTGGCATTA 
TTTGACGACGCGTAAAACATCGCTGCTTTACAGACTGCACTGCGGTAACTGTGACGTTTTCATGAC 
GTCACTACATTCTATTCAAAACATTTCCACAGAAGAGCGAGACCACGGCCGTGATGGGTTCTGGGC 
AGATGATTACCCAAGTATATATTTATAATAACTTGACTGCTTGCCTGAATAATGTTGACACATGAC 
AACGAATTTGTGATAGCGTAAGAAGCGTGAATACTGTGAATAGTGTGAGGGGTGTTTGCTGAGAGT 
TAACCACCGTTAATTGCAAAATTCCCGAATACTTGCATTTGCAGTCGAAGAAGAATTGCATTCTTA 
CTCCTGTGAATGGACTCATTGTTATTTAGCAGCGGTTATTGAGGTTTTGATCACCTCTAAATAGAC 
AATCAGGATGCGGCAAACCGGAAAATTATAGCAGAATCTGTAATTCAAGATGGGCTTGCCTGTGAA
AATATGCTGCGAGTTCAGTAACACTTTTCCCTTTCGATCATGGCCTGTTTTGCTCTGAATCTGGTC
TTTCAGAGGATCCCTGCTTTTTTAAAACTAAAGTCCTCCCAACTCACTTATATTTATGTTTTTTAA 
TTATTTATAGTTTTAATATGAACAACAAATCATATTTATTTACACATTATATTTTTCAG 

DOMAIN 1G-3 (3rd part of domain g) 

GGTCACATAGACTATTTGGGAGTGGATACAACTCGGTCGCCCCGAGACAAGTTGTTCAATGATCCA 
GAGCGAGGATCAGAATCGTTCTTCTACAGGCAGGTTCTCTTGGCTTTGGAGCAGACAGAT 
Figure 9

Primary structure of the KLH1 protein 

DOMAIN B 

GLPYWDWTEPMTHIPGLAGNKTYVDSHGASHTNPFHSSVIAFEENAPHTKRQIDQRLFKPATFGHH 
TDLFNQILYAFEQEDYCDFEVQFEITHNTIHAWTGGSEHFSMSSLHYTAFDPLFYFHHSNVDRLWA 
VWQALQMRRHKPYRAHCAISLEHMHLKPFAFSSPLNNNEKTHANAMPNKIYDYENVLHYTYEDLTF 
GGISLENIEKMIHENQQEDRIYAGFLLAGIRTSANVDIFIKTTDSVQHKAGTFAVLGGSKEMKWGF 
DRVFKFDITHVLKDLDLTADGDFEVTVDITEVDGTKLASSLIPHASVIREHARGKLNR 

DOMAIN C 

VKFDKVPRSRLIRKNVDRLSPEEMNELRKALALLKEDKSAGGFQQLGAFHGEPKWCPSPEASKKFA 
CCVHGMSVFPHWHRLLTVQSENALRRHGYDGALPYWDWTSPLNHLPELADHEKYVDPEDGVEKHNP 
WFDGHIDTVDKTTTRSVQNKLFEQPEFGHYTSIAKQVLLALEQDNFCDFEIQYEIAHNYIHALVGG 
AQPYGMASLRYTAFDPLFYLHHSNTDRIWAIWQALQKYRGKPYNVANCAVTSMREPLQPFGLSANI 
NTDHVTKEHSVPFNVFDYKTNFNYEYDTLEFNGLSISQLNKKLEAIKSQDRFFAGFLLSGFKKSSL 
VKFNICTDSSNCHPAGEFYLLGDENEMPWAYDRVFKYDITEKLHDLKLHAEDHFYIDYEVFDLKPA 
SLGKDLFKQPSVIHEPRI 

DOMAIN D 

GHHEGEVYQAEVTSANRIRKNIENLSLGELESLRAAFLEIENDGTYESIAKFHGSPGLCQLNGNPI 
SCCVHGMPTFPHWHRLYVVVVENALLKKGSSVAVPYWDWTKRIEHLPHLISDATYYNSRQHHYETN
PFHHGKITHENEITTRDPKDSLFHSDYFYEQVLYALEQDNFCDFEIQLEILHNALHSLLGGKGKYS 
MSNLDYAAFDPVFFLHHATTDRIWAIWQDLQRFRKRPYREANCAIQLMHTPLQPFDKSDNNDEATK 
THATPHDGFEYQNSFGYAYDNLELNHYSIPQLDHMLQERKRHDRVFAGFLLHNIGTSADGHVFVCL 
PTGEHTKDCSHEAGMFSILGGQTEMSFVFDRLYKLDITKALKKNGVHLQGDFDLEIEITAVNGSHL
DSHVIHSPTILFEAG 

DOMAIN E 

TDSAHTDDGHTEPVMIRKDITQLDKRQQLSLVKALESMKADHSSDGFQAIASFHALPPLCPSPAAS
KRFACCVHGMATFPQWHRLYTVQFQDSLRKHGAVVGLPYWDWTLPRSELPELLTVSTIHDPETGRD
IPNPFIGSKIEFEGENVHTKRDINRDRLFQGSTKTHHNWFIEQALLALEQTNYCDFEVQFEIMHNG 
VHTWVGGKEPYGIGHLHYASYDPLFYIHHSQTDRIWAIWQSLQRFRGLSGSEANCAVNLMKTPLKP 
FSFGAPYNLNDHTHDFSKPEDTFDYQKFGYIYDTLEFAGWSIRGIDHIVRNRQEHSRVFAGFLLEG 
FGTSATVDFQVCRTAGDCEDAGYFTVLGGEKEMPWAFDRLYKYDITETLDKMNLRHDEIFQIEVTI 
TSYDGTVLDSGLIPTPSIIYDPAH 

DOMAIN F 

HDISSHHLSLNKVRHDLSTLSERDIGSLKYALSSLQADTSADGFAAIASFHGLPAKCNDSHNNEVA 
CCIHGMPTFPHWHRLYTLQFEQALRRHGSSVAVPYWDWTKPIHNIPHLFTDKEYYDVWRNKVMPNP 
FARGYVPSHDTYTVRDVQEGLFHLTSTGEHSALLNQALLALEQHDYCDFAVQFEVMHNTIHYLVGG 
PQVYSLSSLHYASYDPIFFIHHSFVDKVWAVWQALQEKRGLPSDRADCAVSLMTQNMRPFHYEINH
NQFTKKHAVPNDVFKYELLGYRYDNLEIGGMNLHEIEKEIKDKQHHVRVFAGFLLHGIRTSADVQF 
QICKTSEDCHHGGQIFVLGGTKEMAWAYNRLFKYDITHALHDAHITPEDVFHPSEPFFIKVSVTAV 
NGTVLPASILHAPTIIYEPGLG
DOMAIN G

DHHEDHHSSSMAGHGVRKEINTLTTAEVDNLKDAMRAVMADHGPNGYQAIAAFHGNPPMCPMPDGK 
NYSCCTHGMATFPHWHRLYTKQMEDALTAHGARVGLPYWDGTTAFTALPTFVTDEEDNPFHHGHID 
YLGVDTTRSPRDKLFNDPERGSESFFYRQVLLALEQTD 
Figure 10

Genomic sequence of the KLH2 gene 



DOMAIN 2B 

GGCCTGCCCTACTGGGATTGGACCATGCCAATGAGTCATTTGCCAGAACTGGCTACAAGTGAGACC 
TACCTCGATCCAGTTACTGGGGAAACTAAAAACAACCCTTTCCATCACGCCCAAGTGGCGTTTGAA 
AATGGTGTAACAAGCAGGAATCCTGATGCCAAACTTTTTATGAAACCAACTTACGGAGACCACACT
TACCTCTTCGACAGCATGATCTACGCATTTGAGCAGGAAGACTTCTGCGACTTTGAAGTCCAATAT
GAGCTCACGCATAATGCAATACATGCATGGGTTGGAGGCAGTGAAAAGTATTCAATGTCTTCTCTT 
CACTACACTGCTTTTGATCCTATATTTTACCTCCATCACTCAAATGTTGATCGTCTCTGGGCCATT 
TGGCAAGCTCTTCAAATCAGGAGAGGCAAGTCTTACAAGGCCCACTGCGCCTCGTCTCAAGAAAGA 
GAACCATTAAAGCCTTTTGCATTCAGTTCCCCACTGAACAACAACGAGAAAACGTACCACAACTCT
GTCCCCACTAACGTTTATGACTATGTGGGAGTTTTGCACTATCGATATGATGACCTTCAGTTTGGC 
GGTATGACCATGTCAGAACTTGAGGAATATATTCACAAGCAGACACAACATGATAGAACCTTTGCA
GGATTCTTCCTTTCATATATTGGAACATCAGCAAGCGTAGATATCTTCATCAATCGAGAAGGTCAT
GATAAATACAAAGTGGGAAGTTTTGTAGTACTTGGTGGATCCAAAGAAATGAAATGGGGCTTTGAT 
AGAATGTACAAGTATGAGATCACTGAGGCTCTGAAGACGCTGAATGTTGCAGTGGATGATGGGTTC 
AGCATTACTGTTGAGATCACCGATGTTGATGGATCTCCCCCATCTGCAGATCTCATTCCACCTCCT 
GCTATAATCTTTGACGTGGTCAGAG 

INTRON 2B/2C (SEQ ID NO: 147) 

GTATTTAAAAAAGTAATAAAACCATATTTTCGAATGCGCTTTATGAAATATCGTGTGACTGGTTCT 
TTAGTTTACATGGAGTGTAACAACATGCTCCATCAGTTGACATATACTGCTCACACAAAGTAAGGG
ATATTTGATAATGATAACAAATATAATCAAAGCGGTTATACTATCAAGACTTATTCACATAATTAC
AGGTGAAGGGAGGTGTGATCGTGTTCACTGATCAGGTTGAGGCCAGAGAAGTCCCAGTTTGAGTCT 
TGCAGAAGATGATGTTTAGGCATGGGGTCGAATCACCAAAATCACATGACTTCAATAACGGGTTGG 
ACCACCTCGAGCGACGATGCAAGCAGTAGAGCGTCTACGCATGCTCCTGATAAGGCGACCAATCTG 
TTCCTGGGGAATCAGTCGCCACTCCTCTTGTAGTGCCACGCTCATTTCTGCTACGGTCCTGGGTAC 
CTGCTATCGGGTCTTGATCCGTATCCCAAGGATGTCCCACACATGTTCAAGGTGAGAGGTCGGGGA 
ACATCGCTGGCCACGGTAAGGTCTGAATTTGATGCCGTTGAAAGTGAGCTCTGACAACCTGAGCAT 
GGTGAGCTCTGACGTTGTCGTCCTGAAAGATGAATCCAGCTCCATGACAGCGAGCAAAGGGCAGGA 
CGTGTTGGTCAATGCAGTTGTCTCTGCAGTACACACCTGTCACTCGCCACTCACAAGCGTGTAGAT 
CTGTACGACCAGTCATGGAGATCCCAGCCCACATCATAACGGACCCCTATCCATACCGATCATGAG
CCACCATAGCAGCGTCTTGATGACGTTCTCCCTGTCGCCTCGACATCCTCACACGGCCAAAAGGAA 
CGTGGACTCGTCACTGAACATGACATTAGCCAACCTGGCACTTGTCCACCGCTGATGTTGGCGAGA 
CCATTCCAGTCGAGCTCTTCGGTGTCTGGCTTTCATCGATAACACGACGTAAGGTCTGCGGGCGTG 
CAAGACGGCTCTATGCAGGCGATTTCGGATTGTCTGGGTGCTAACTCTGATCCCAGGTGCCTGCTG 
AAGTTGATGCTGGATCTGTGTGGCATTGAGATGGCGATTCCTTAGGACTGTGGAGATGATGAATCG 
ATCTTGACTTATGGTGGTGACATTAGGACGTCGGGTTCGTGTCCTATCCTGCACTCTTCCAGTTGT 
TCGGTGACGCTCTGGTACCCGGCTGATTACTGACTGAGAATATCCATCTGCCGTGCGACATGAGCC 
TGTGTTGGCCCAGCCTGAAGCATTGCAATCGCCAGAGACGCTCTTCAAAAGTCATTCGACGCATGG 
TTTTCTGTTCACAAATGACAGCGTAAAACAGTTTTTGGTGCTTTTATGCTTCCCAAGAGCATGAAA 
AACACGTTCTATGGGTCGTGCACACCTTACATGACAAGTGTGAAAAGTGACTTGCACCCCCTTGTG 
TGTTCGGATGCACACTCTGTTTACGTACTGATGCGATTTGGCGTCTAAACATGTTTTGGCGTCTAA 
ACATGTTTTCCTGCATGATTCATATACTATTTTGTCATATTCCTGGCATCAAACCAAACTACAGTG
AAATATATTTCAATATCCCCTACTTTGTGTGAGTAGTATAGATCACTGCAGACAACATATAGACAA
TGCAGTTACACCGTCAACAATCCCAGTCATTAATTATGATGACACTTCCACACATAGTGTCAGTGA
TTGTAATTCAACTGTACACACTTTTCCCGTGAACATTCAGGATCTATATGACTAAATATATAACAT
TAGTATACGTGCAGTTTTGTATCGCTACGACATTGTTGTAACTCTTTGTTTAATCATTTAACAG 
DOMAIN 2C

CTGATGCCAAAGACTTTGGCCATAGCAGAAAAATCAGGAAAGCCGTTGATTCTCTGACAGTCGAAG 
AACAAACTTCGTTGAGGCGAGCTATGGCAGATCTACAGGACGACAAAACATCAGGGGGTTTCCAGC 
AGATTGCAGCATTCCACGGAGAACCAAAATGGTGTCCAAGCCCCGAAGCGGAGAAAAAATTTGCAT 
GCTGTGTTCATGGAATGGCTGTTTTGCCTCACTGGCACAGATTGCTGACAGTTCAAGGAGAAAATG 
CTCTGAGGAAACATGGATTTACTGGTGGATTGCCCTATTGGGACTGGACTCGGCCAATGAGCGCCC 
TTCCACATTTTGTTGCTGATCCTACTTACAATGATTCTGTTTCCAGCCTCGAAGAAGATAACCCAT 
GGTATCATGGTCACATAGATTCTGTTGGGCATGATACTACAAGAGCTGTGCGTGATGATCTTTATC 
AATCTCCTGGTTTCGGTCACTACACAGATATTGCAAAACAAGTCCTTCTGGCCTTTGAGCAGGACG 
ATTTCTGTGATTTTGAGGTACAATTTGAAATTGCCCATAATTTCATACATGCTCTGGTTGGTGGTA 
ACGAACCATACAGTATGTCATCTTTGAGGTATACTACATACGATCCAATCTTCTTCTTGCACCGCT
CCAATACAGACCGACTTTGGGCCATTTGGCAAGCTTTGCAAAAATACCGGGGGAAACCATACAACA 
CTGCAAACTGTGCCATTGCATCCATGAGAAAACCACTTCAGCCATTTGGTCTTGATAGTGTCATAA 
ATCCAGATGACGAAACTCGTGAACATTCGGTTCCTTTCCGAGTCTTCGACTACAAGAACAACTTCG 
ACTATGAGTATGAGAGCCTGGCATTTAATGGTCTGTCTATTGCCCAACTGGACCGAGAGTTGCAGA 
GAAGAAAGTCACATGACAGAGTCTTTGCAGGATTCCTTCTTCATGAAATTGGACAGTCTGCACTCG
TGAAATTCTACGTTTGCAAACACAATGTATCTGACTGTGACCATTATGCTGGAGAATTCTACATTT
TGGGAGATGAAGCTGAGATGCCTTGGAGGTATGACCGTGTGTACAAGTACGAGATAACACAGCAGC 
TGCACGATTTAGATCTACATGTTGGAGATAATTTCTTCCTTAAATATGAAGCCTTTGATCTGAATG 
GCGGAAGTCTTGGTGGAAGTATCTTTTCTCAGCCTTCGGTGATTTTCGAGCCAGCTGCAG 

INTRON 2C/2D (SEQ ID NO: 148) 

GTATGTTTTAAATGTCACTTATCCGTGATCTGTAATGAAGTTAGCAATTCACTTTATCAACTGTTT 
GGCTGTACTGTTTCAGTGCGAGTTTTACTTAGGTTGGATTAATTAAAATATTCAAGCTCATAAATG
TTTTGATTCAACTTTTGTTATTTATTTCAAACAG 

DOMAIN 2D 

GTTCACACCAGGCTGATGAATATCGTGAGGCAGTAACAAGCGCTAGCCACATAAGAAAAAATATCC 
GGGACCTCTCAGAGGGAGAAATTGAGAGCATCAGATCTGCTTTCCTCCAAATTCAAAAAGAGGGTA
TATATGAAAACATTGCAAAGTTCCATGGAAAACCAGGACTTTGTGAACATGATGGACATCCTGTTG
CTTGTTGTGTCCATGGCATGCCCACCTTTCCCCACTGGCACAGACTGTACGTTCTTCAGGTGGAGA 
ATGCGCTCTTAGAACGAGGGTCTGCAGTTGCTGTTCCTTACTGGGACTGGACCGAGAAAGCTGACT 
CTCTGCCATCATTAATCAATGATGCAACTTATTTCAATTCACGATCCCAGACCTTTGATCCTAATC
CTTTCTTCAGGGGACATATTGCCTTCGAGAATGCTGTGACGTCCAGAGATCCTCAGCCAGAACTAT 
GGGACAATAAGGACTTCTACGAGAATGTCATGCTGGCTCTTGAGCAAGACAACTTCTGTGACTTTG 
AGATTCAGCTTGAGCTGATACACAACGCCCTTCATTCTAGACTTGGAGGAAGGGCTAAATACTCCC 
TTTCGTCTCTTGATTATACCGCATTTGATCCTGTATTTTTCCTTCACCATGCAAACGTTGACAGAA 
TCTGGGCCATCTGGCAGGACTTGCAGAGATATAGAAAGAAACCATACAATGAGGCTGACTGCGCAG 
TCAACGAGATGCGTAAACCTCTTCAACCATTTAATAACCCAGAACTTAACAGTGATTCCATGACGC
TTAAACACAACCTCCCACAAGACAGTTTTGATTATCAAAACCGCTTCAGGTACCAATATGATAACC
TTCAATTTAACCACTTCAGCATACAAAAGCTAGACCAAACTATTCAGGCTAGAAAACAACACGACA
GAGTTTTTGCTGGCTTTATTCTTCACAACATTGGGACATCTGCTGTTGTAGATATTTATATTTGCG 
TTGAACAAGGAGGAGAACAAAACTGCAAGACAAAGGCGGGTTCCTTCACGATTCTGGGGGGAGAAA 
CAGAAATGCCATTCCACTTTGACCGCTTGTACAAATTTGACATAACGTCTGCTCTGCATAAACTTG 
GTGTTCCCTTGGACGGACATGGATTCGACATCAAAGTTGACGTCAGAGCTGTCAATGGATCGCATC 
TTGATCAACACATCCTCAACGAACCGAGTCTGCTTTTTGTTCCTGGTGAACGTAAGAATATATATT 
ATG 
INTRON 2D/2E (SEQ ID NO: 149)

GTTATAAAGCAGTATATTCTCTTCAAAAAAGTAGGGGAACTTGGAATTTCAAGGTAAATAACATAA
CTACCTTCAACGGCACAATATCCATATGATGCCCTGGCCAGCAATGAGGCGTGATCTTTTCCCCAT 
TAAAAATGTCTGGAACATCTTGGGCAAACGTGTGCGTCAACGTAAAACGCCACCAGTCACGCTAGA 
TGAACTTGTCCAGGCGTTGGTGGAAGAATGGGACAGACTGCATCAATTACCATAAGTAGACTCATT 
TGCAGCGAATCAGTCAGTGTTTGACCAATAACGGGGGCATTACGCACTACTGACGCAAAACAATGT 
CAATTTCCGTTTCTTACCCATTCCTTCTTTCACGGACCATAACAGCAAGAGAAACTGNTTAGGTAA 
TGAAATACCGGTGAATTATTGTTAACTGGATTCCTTCTTTGTAAAGATACAATTAGTTTGGGACCA 
ATTATTATTATCATTAGTTTGTTATTGACCTTGAAATTCGAAGTTCCTCTACATTTTTTAAGGAGT 
TTATTTGATTGACAATGAAATGTAAGAAAAGAGCAAATCGTAAAATACGTTAAAAATTATTCCTTA
AACATCAGTCTCTAACTTCAGTTTAAATTGCCAGTAACACGTGTTATATGATGTTTCCGTTTCTCT 
TTGTTTTTTAGCATTCAACTTATTTGATATAACGTTTTACTGTTTTAGATTCACATCAAACTGCAG

DOMAIN 2E 

ATGGGCTTTCACAACATAATCTTGTGCGAAAAGAAGTAAGCTCTCTTACAACACTGGAGAAACATT
TTTTGAGGAAAGCTCTCAAGAACATGCAAGCAGATGATTCTCCAGACGGATATCAAGCTATTGCTT 
CTTTCCACGCTTTGCCTCCTCTTTGTCCAAGTCCATCTGCTGCACATAGACACGCTTGTTGCCTCC 
ATGGTATGGCTACCTTCCCTCAGTGGCACAGACTCTACACAGTTCAGTTCGAAGATTCTTTGAAAC
GACATGGTTCTATTGTCGGACTTCCATATTGGGATTGGCTGAAACCGCAGTCTGCACTCCCTGATT 
TGGTGACACAGGAGACATACGAGCACCTGTTTTCACACAAAACCTTCCCAAATCCGTTCCTCAAGG 
CAAATATAGAATTTGAGGGAGAGGGAGTAACAACAGAGAGGGATGTTGATGCTGAACACCTCTTTG 
CAAAAGGAAATCTGGTTTACAACAACTGGTTTTGCAATCAGGCACTATATGCACTAGAACAAGAAA
ATTACTGTGACTTTGAAATACAGTTCGAAATTTTGCATAATGGAATTCATTCATGGGTTGGAGGAT 
CAAAGACCCATTCAATAGGTCATCTTCATTACGCATCATACGATCCACTGTTCTATATCCACCATT
CGCAGACAGATCGCATTTGGGCTATCTGGCAAGCTCTCCAGGAGCACAGAGGTCTTTCAGGGAAGG 
AAGCACACTGCGCCCTGGAGCAAATGAAAGACCCTCTCAAACCTTTCAGCTTTGGAAGTCCCTATA 
ATTTGAACAAACGCACTCAAGAGTTCTCCAAGCCTGAAGACACATTTGATTATCACCGATTCGGGT
ATGAGTATGATTCCCTCGAATTTGTTGGCATGTCTGTTTCAAGTTTACATAACTATATAAAACAAC 
AACAGGAAGCTGATAGAGTCTTCGCAGGATTCCTTCTTAAACGATTTGGACAATCAGCATCCGTAT
CGTTTGATATCTGCAGACCAGACCAGAGTTGCCAAGAAGCTGGATACTTCTCAGTTCTCGGTGGAA 
GTTCAGAAATGCCGTGGCAGTTTGACAGGCTTTACAAGTACGACATTACAAAAACGTTGAAAGACA 
TGAAACTGCGATACGATGACACATTTACCATCAAGGTTCACATAAAGGATATAGCTGGAGCTGAGT 
TGGACAGCGATCTGATTCCAACTCCTTCTGTTCTCCTTGAAGAAGGAAAGC 

INTRON 2E/2F (SEQ ID NO: 150) 

GTATGTATCTCATGTTTCTCAAATAATTTGATTTTCAATGCCCTTACTATAAAGCACAGTTATTGT
TCAGTGCCAGTAACCGTTTATTTACGTAAATGTTACAGGCTATTATAATCAAAAATACATTACCGA
TATTGTTTACCACACAATTATATCATTGTCAAAATCTACCCCCATTACCTGCGTTTTGAATTTGTA 
ACCTTCTGACAAAAATGAATTAGCAAGAGCTCTGATGAAGAACATAATGAACAACACCTATCTTTC
TTCTTTCAATGACGGTTTAACAATACAATGCACAATGTAAAAAAATATATATATATATATAATTTT
ATATCTACAGTTAATGCAAATGACTCCACTAATTCAGGGAAACACATTTTCAG

DOMAIN 2F-1 (1st part of domain f) 

ATGGGATCAATGTACGTCACGTTGGTCGTAATCGGATTCGTATGGAACTATCTGAACTCACCGAGA 
GAGATCTCGCCAGCCTGAAATCTGCAATGAGGTCTCTACAAGCTGACGATGGGGTGAACGGTTATC 
AAGCCATTGCATCATTCCACGGTCTCCCGGCTTCTTGTCATGATGATGAGGGACATGAG 
INTRON 2F (SEQ ID NO: 151)

GTAAAATAAAACGTCCAGTCATCGGAAACCCGCCCAGATATATGGGTTTTTTTCTATTTAAACAAA 
AAAGCAGAGACAAAAAGATTATTAAAAGTCACATTTAACTTGATATCAGATCAATAGTTTGGCTAG
TTAGTGCTCTATATCCCTCAAATCCTTCGAATCTTTAAGCCTCGTGATATTTTGACAAACAGAGAA 
GACTTAGTAGCCCAGACTTTCCCTTATTTTTTCCTGAAAATCTTAATACGGATATTAAATGGATTC 
ATTCTGCAACCTACAACCATAGCCCATATGTTATTATTTCAG

DOMAIN 2F-2 (2nd part of domain f) 

ATTGCCTGTTGTATCCACGGAATGCCAGTATTCCCACACTGGCACAGGCTTTACACCCTGCAAATG 
GACATGGCTCTGTTATCTCACGGATCTGCTGTTGCTATTCCATACTGGGACTGGACCAAACCTATC 
AGCAAACTGCCTGATCTCTTCACCAGCCCTGAATATTACGATCCTTGGAGGGATGCAGTTGTCAAT 
AATCCATTTGCTAAAGGCTACATTAAATCCGAGGACGCTTACACGGTTAGGGATCCTCAGGACATT 
TTGTACCACTTGCAGGACGAAACGGGAACATCTGTTTTGTTAGATCAAACTCTTTTAGCCTTAGAG 
CAGACAGATTTCTGTGATTTTGAGGTTCAATTTGAGGTCGTCCATAATGCTATTCACTACTTGGTG 
GGTGGTCGACAAGTTTATGCTCTTTCTTCTCAACACTATGCTTCATATGACCCAGCCTTCTTTATT 
CATCACTCCTTTGTTGACAAAATATGGGCAGTCTGGCAAGCTCTGCAAAAGAAGAGAAAGCGTCCC 
TATCATAAAGCGGATTGTGCTCTTAACATGATGACCAAACCAATGCGACCATTTGCACACGATTTC
AATCACAATGGATTCACAAAAATGCACGCAGTCCCCAACACTCTATTTGACTTTCAGGACCTTTTC 
TACACGTATGACAACTTAGAAATTGCTGGCATGAATGTTAATCAGTTGGAAGCGGAAATCAACCGG
CGAAAAAGCCAAACAAGAGTCTTTGCCGGGTTCCTTCTACATGGCATTGGAAGATCAGCTGATGTA 
CGATTTTGGATTTGCAAGACAGCTGACGACTGCCACGCATCTGGCATGATCTTTATCTTAGGAGGT 
TCTAAAGAGATGCACTGGGCCTATGACAGGAACTTTAAATACGACATCACCCAAGCTTTGAAGGCT
CAGTCCATACACCCTGAAGATGTGTTTGACACTGATGCTCCTTTCTTCATTAAAGTGGAGGTCCAT 
GGTGTAAACAAGACTGCTCTCCCATCTTCAGCTATCCCAGCACCTACTATAATCTACTCAGCTGGT 
GAAG 

INTRON 2F-2/2G (SEQ ID NO: 152) 

GTGAGAGAAACTATAATAGTGTATGTCGGCAAAAAATGTGCTCATATCATGACTCTGTTGGCCGGT 
GGTTGCTCTCCTCTCCTCCTCCACCACCACCGGTACCTCCACCTGTCAGGGCATCAATGTACCATG 
AAAATGTCTACAATACTAGGCCTCCTGTAGAAGCACGTAAGATTTACATGGCCGGTTTGTAACTAG 
TTTAAAGTGCTTCACAGTAACCAAAACCAGTCTCTAAAGATTAATGTCTGTTTAAAATTTAATGCC 
ACATTTTCAACTGACATATTCTTGCAATTAAGTACAAATGAAGTAGTATAAATTATCCACAAATAG
CGTGATGCACCACAAATATAAACCGAGTGCTTTTTTGGCATTCCCCACTTGTTCTGGCATGATCAC 
ATCATAGATCTCGTTCATGAAGATACTGTTGGATGCTTTTTCCCAATATGCCCCAATCTGTTAAAT 
TATTTACACGACCGCAGTGTGTACTTTCATCACTCAGATCTTTACAATGTGTTTGTAACGTTTACA 
ATTAGCGTTATGATTGAAATATTACCCCCTGCTACGTTAAATCACATTCACTCACTCATCTGATGT 
ACTTTACAGGTCATACCGATGATCACGGCTCAG

DOMAIN 2G-1 (1st part of domain g) 

ATCATATTGCTGGCAGTGGAGTCAGGAAAGACGTGACGTCTCTTACCGCATCTGAGATAGAGAACC 
TGAGGCATGCTCTGCAAAGCGTGATGGATGATGATGGACCCAATGGATTCCAGGCAATTGCTGCTT 
ATCACGGAAGTCCTCCCATGTGTCACATGCCTGATGGTAGAGACGTTGCATGTTGTACTCATG 

INTRON 2G-1/2G-2 (SEQ ID NO: 153) 

GTCAGTATTCTCCAATATGTTTGACTAGTGTCTTGCTCATGTATCAACTATTTTAGGCAACGTTTT
TGATTGTTATGGTATTTTCATGATATGATTTTATTGCTACCTCTATACCCAAACAAAAATGTTTTA
TCAACAATTGTTTGAGTTTTAATGCAAGAAAATTATCAGGAGTAGCGTGCAAAAATGACTGGAAGG 
CATGGTGTACTTCTGTGTGTACATACAAGTGGGTAATGCCTTATTGAACTCGTAATCACTCGTTTC 
AG 
DOMAIN 2G-2 (2nd part of domain g)

GAATGGCATCTTTCCCTCACTGGCACAGACTGTTTGTGAAACAGATGGAGGATGCACTGGCTGCGC 
ATGGAGCTCACATTGGCATACCATACTGGGATTGGACAAGTGCGTTTAGTCATCTGCCTGCCCTAG 
TGACTGACCACGAGCACAATCCCTTCCACCAC

INTRON 2G-2/2G-3 (SEQ ID NO: 154) 

GTCAGTATTCTCCAATATGTTTGACTAGTGTCTTGCTCATGTATCAACTATTTTAGGCAACGTTTT 
TGATTGTTATGGTATTTTCATGATATGATTTTATTGCTACCTCTATACCCAAACAAAAATGTTTTA
TCAACAATTGTTTGAGTTTTAATGCAAGAAAATTATCAGGAGTAGCGTGCAAAAATGACTGGAAGG 
CATGGTGTACTTCTGTGTGTACATACAAGTGGGTAATGCCTTATTGAACTCGTAATCACTCGTTTC 
AG 

DOMAIN 2G-3 (3rd part of domain g) 

GGACATATTGCTCATCGGAATGTGGATACATCTCGATCTCCGAGAGACATGCTGTTCAATGACCCC 
GAACACGGGTCAGAATCATTCTTCTATAGACAGGTTCTCTTGGCTCTAGAACAGACAGACTTCTGC 
CAATTTGAAGTTCAGTTTGAAATAACACACAATGCAATCCACTCTTGGACTGGAGGACATACTCCA
TATGGAATGTCATCACTGGAATATACAGCATATGATCCACTCTTTTATCTCCACCATTCCAACACT
GATCGTATCTGGGCCATCTGGCAGGCACTCCAGAAATACAGAGGTTTTCAATACAACGCAGCTCAT 
TGCGATATCCAGGTTCTGAAACAACCTCTTAAACCATTCAGCGAGTCCAGGAATCCAAACCCAGTC 
ACCAGAGCCAATTCTAGGGCAGTCGATTCATTTGATTATGAGAGACTCAATTATCAATATGACACA
CTTACCTTCCACGGACATTCTATCTCAGAACTTGATGCCATGCTTCAAGAGAGAAAGAAGGAAGAG
AGAACATTTGCAGCCTTCCTGTTGCACGGATTTGGCGCCAGTGCTGATGTTTCGTTTGATGTCTGC 
ACACCTGATGGTCATTGTGCCTTTGCTGGAACCTTCGCGGTACTTGGTGGGGAGCTTGAGATGCCC 
TGGTCCTTTGAAAGATTGTTCCGTTACGATATCACAAAGGTTCTCAAGCAGATGAATCTTCACTAT 
GATTCTGAGTTCCACTTTGAGTTGAAGATTGTTGGCACAGATGGAACAGAACTGCCATCGGATCGT
ATCAAGAGCCCTACCATTGAACACCATGGAGGAG

INTRON 2G/2H (SEQ ID NO: 155) 

GTATGTTTTGAGATCCACATAATCTTCTACCCTGTCTCATTTCTAATGCTCTTCAATACACAATTT 
ATATAGCCTTTGAGCTTCAGATGTATTACGGACAGGCATTACAGTATACATGTAATATGGTTTTCT 
GCTATTTGCAAAAATTGTGTCCTATCTCTGTTCAGATCATCATGGCGGTGACACCTAG 

DOMAIN 2H (SEQ ID NO: 159) 

GTCACGATCACAGTGAACGTCACGATGGATTTTTCAGGAAGGAAGTCGGTTCCCTGTCCCTGGATG 
AAGCCAATGACCTTAAAAATGCACTGTACAAGCTGCAGAATGATCAGGGTCCCAATGGATATGAAT 
CAATAGCCGGTTACCATGGCTATCCATTCCTCTGCCCTGAACATGGTGAAGACCAGTACGCATGCT 
GTGTCCACGGAATGCCTGTATTTCCACATTGGCACAGACTTCATACAATCCAGTTTGAGAGAGCTC 
TCAAAGAACATGGTTCTCATTTGGGTCTGCCATACTGGGACTGGAC 
Figure 11

Primary structure of the KLH2 protein 

DOMAIN B 

GLPYWDWTMPMSHLPELATSETYLDPVTGETKNNPFHHAQVAFENGVTSRNPDAKLFMKPTYGDHT 
YLFDSMIYAFEQEDFCDFEVQYELTHNAIHAWVGGSEKYSMSSLHYTAFDPIFYLHHSNVDRLWAI 
WQALQIRRGKSYKAHCASSQEREPLKPFAFSSPLNNNEKTYHNSVPTNVYDYVGVLHYRYDDLQFG 
GMTMSELEEYIHKQTQHDRTFAGFFLSYIGTSASVDIFINREGHDKYKVGSFWLGGSKEMKWGFD 
RMYKYEITEALKTLNVAVDDGFSITVEITDVDGSPPSADLIPPPAIIFDWR

DOMAIN C 

ADAKDFGHSRKIRKAVDSLTVEEQTSLRRAMADLQDDKTSGGFQQIAAFHGEPKWCPSPEAEKKFA 
CCVHGMAVFPHWHRLLTVQGENALRKHGFTGGLPYWDWTRPMSALPHFVADPTYNDSVSSLEEDNP 
WYHGHIDSVGHDTTRAVRDDLYQSPGFGHYTDIAKQVLLAFEQDDFCDFEVQFEIAHNFIHALVGG 
NEPYSMSSLRYTTYDPIFFLHRSNTDRLWAIWQALQKYRGKPYNTANCAIASMRKPLQPFGLDSVI 
NPDDETREHSVPFRVFDYKNNFDYEYESLAFNGLSIAQLDRELQRRKSHDRVFAGFLLHEIGQSAL 
VKFYVCKHNVSDCDHYAGEFYILGDEAEMPWRYDRVYKYEITQQLHDLDLHVGDNFFLKYEAFDLN 
GGSLGGSIFSQPSVIFEPAA 

DOMAIN D 

GSHQADEYREAVTSASHIRKNIRDLSEGEIESIRSAFLQIQKEGIYENIAKFHGKPGLCEHDGHPV 
ACCVHGMPTFPHWHRLYVLQVENALLERGSAVAVPYWDWTEKADSLPSLINDATYFNSRSQTFDPN 
PFFRGHIAFENAVTSRDPQPELWDNKDFYENVMLALEQDNFCDFEIQLELIHNALHSRLGGRAKYS 
LSSLDYTAFDPVFFLHHANVDRIWAIWQDLQRYRKKPYNEADCAVNEMRKPLQPFNNPELNSDSMT 
LKHNLPQDSFDYQNRFRYQYDNLQFNHFSIQKLDQTIQARKQHDRVFAGFILHNIGTSAWDIYIC
VEQGGEQNCKTKAGSFTILGGETEMPFHFDRLYKFDITSALHKLGVPLDGHGFDIKVDVRAVNGSH 
LDQHILNEPSLLFVPGERKNIYY

DOMAIN E 

DGLSQHNLVRKEVSSLTTLEKHFLRKALKNMQADDSPDGYQAIASFHALPPLCPSPSAAHRHACCL
HGMATFPQWHRLYTVQFEDSLKRHGSIVGLPYWDWLKPQSALPDLVTQETYEHLFSHKTFPNPFLK 
ANIEFEGEGVTTERDVDAEHLFAKGNLVYNNWFCNQALYALEQENYCDFEIQFEILHNGIHSWVGG 
SKTHSIGHLHYASYDPLFYIHHSQTDRIWAIWQALQEHRGLSGKEAHCALEQMKDPLKPFSFGSPY 
NLNKRTQEFSKPEDTFDYHRFGYEYDSLEFVGMSVSSLHNYIKQQQEADRVFAGFLLKGFGQSASV 
SFDICRPDQSCQEAGYFSVLGGSSEMPWQFDRLYKYDITKTLKDMKLRYDDTFTIKVHIKDIAGAE 
LDSDLIPTPSVLLEEGK 

DOMAIN F 

HGINVRHVGRNRIRMELSELTERDLASLKSAMRSLQADDGVNGYQAIASFHGLPASCHDDEGHEIA 
CCIHGMPVFPHWHRLYTLQMDMALLSHGSAVAIPYWDWTKPISKLPDLFTSPEYYDPWRDAWNNP 
FAKGYIKSEDAYTVRDPQDILYHLQDETGTSVLLDQTLLALEQTDFCDFEVQFEVVHNAIHYLVGG
RQVYALSSQHYASYDPAFFIHHSFVDKIWAVWQALQKKRKRPYHKADCALNMMTKPMRPFAHDFNH
NGFTKMHAVPNTLFDFQDLFYTYDNLEIAGMNVNQLEAEINRRKSQTRVFAGFLLHGIGRSADVRF 
WICKTADDCHASGMIFILGGSKEMHWAYDRNFKYDITQALKAQSIHPEDVFDTDAPFFIKVEVHGV
NKTALPSSAIPAPTIIYSAGE

DOMAIN G 

DHIAGSGVRKDVTSLTASEIENLRHALQSVMDDDGPNGFQAIAAYHGSPPMCHMPDGRDVACCTHG
MASFPHWHRLFVKQMEDALAAHGAHIGIPYWDWTSAFSHLPALVTDHEHNPFHHGHIAHRNVDTSR
SPRDMLFNDPEHGSESFFYRQVLLALEQTDFCQFEVQFEITHNAIHSWTGGHTPYGMSSLEYTAYD 
PLFYLHHSNTDRIWAIWQALQKYRGFQYNAAHCDIQVLKQPLKPFSESRNPNPVTRANSRAVDSFD 
YERLNYQYDTLTFHGHSISELDAMLQERKKEERTFAAFLLHGFGASADVSFDVCTPDGHCAFAGTF
AVLGGELEMPWSFERLFRYDITKVLKQMNLHYDSEFHFELKIVGTDGTELPSDRIKSPTIEHHGG 

DOMAIN H (SEQ ID NO: 158)

GHDHSERHDGFFRKEVGSLSLDEANDLKNALYKLQNDQGPNGYESIAGYHGYPFLCPEHGEDQYAC 
CVHGMPVFPHWHRLHTIQFERALKEHGSHLGLPYWDW



